Artificial neural networks and learning algorithms.
This document discusses characterizing polymeric membranes under large deformations using an artificial neural network model. It presents an experimental study of blowing circular thermoplastic ABS membranes using the free-blowing technique. A multilayer neural network is used to model the nonlinear behavior of the membrane under biaxial deformation. The neural network results are compared to experimental data and to a finite difference model using a hyperelastic Mooney-Rivlin model. The neural network accurately reproduces the membrane behavior with minimal error margins compared to experimental measurements.
Comparison Between Levenberg-Marquardt And Scaled Conjugate Gradient Training... - CSCJournals
The document compares the Levenberg-Marquardt and Scaled Conjugate Gradient algorithms for training a multilayer perceptron neural network for image compression. It finds that while both algorithms performed comparably in terms of accuracy and speed, the Levenberg-Marquardt algorithm achieved slightly better accuracy as measured by average training accuracy and mean squared error, while the Scaled Conjugate Gradient algorithm was faster as measured by average training iterations. The document compresses a standard test image called Lena using both algorithms and analyzes the results.
We propose an algorithm for training multilayer perceptrons for classification problems, which we name Hidden Layer Learning Vector Quantization (H-LVQ). It consists of applying Learning Vector Quantization to the last hidden layer of an MLP, and it gave very successful results on problems containing a large number of correlated inputs. It was applied with excellent results to the classification of Rutherford backscattering spectra and to a benchmark image recognition problem. It may also be used for efficient feature extraction.
This document provides instructions for three exercises using artificial neural networks (ANNs) in Matlab: function fitting, pattern recognition, and clustering. It begins with background on ANNs including their structure, learning rules, training process, and common architectures. The exercises then guide using ANNs in Matlab for regression to predict house prices from data, classification of tumors as benign or malignant, and clustering of data. Instructions include loading data, creating and training networks, and evaluating results using both the GUI and command line. Improving results through retraining or adding neurons is also discussed.
Web spam classification using supervised artificial neural network algorithms - aciijournal
Due to the rapid growth in the technology employed by spammers, there is a need for classifiers that are more efficient, generic, and highly adaptive. Neural-network-based technologies have a high capacity for adaptation as well as generalization. To the best of our knowledge, very little work has been done in this field using neural networks; we present this paper to fill that gap. This paper evaluates the performance of three supervised learning algorithms for artificial neural networks by creating classifiers for the complex problem of recent web spam pattern classification: the Conjugate Gradient algorithm, Resilient Backpropagation learning, and the Levenberg-Marquardt algorithm.
A COMPARATIVE STUDY OF BACKPROPAGATION ALGORITHMS IN FINANCIAL PREDICTION - IJCSEA Journal
Stock market price index prediction is a challenging task for investors and scholars. Artificial neural networks have been widely employed to predict financial stock market levels thanks to their ability to model nonlinear functions. The accuracy of backpropagation neural networks trained with different heuristic and numerical algorithms is measured for comparison purposes. It is found that numerical algorithms outperform heuristic techniques.
Efficiency of Neural Networks Study in the Design of Trusses - IRJET Journal
The document examines the efficiency of different types of artificial neural networks (ANNs) in the design of trusses. It analyzes generalized regression, radial basis function, and linear layer neural networks using the MATLAB neural network tool. Various truss models are analyzed using the ANNs and STAAD Pro software. The ANNs are trained and tested for interpolation and extrapolation to calculate percentage errors. Parameters such as spread constants, the number of training runs, and the number of input/output variables are varied to study their effect on ANN performance and efficiency. The study aims to determine the most suitable ANN type for truss design based on the percentage error results.
Neural Network Based Individual Classification System - IRJET Journal
This document describes a neural network model developed for individual classification. The model was designed to measure personality traits through a questionnaire. It then uses a neural network trained on sample data through unsupervised and supervised learning with multilayer perceptrons. The backpropagation algorithm was used to train the network. The neural network architecture included multiple neuron layers trained on a 200-item data set, achieving 99.82% accuracy. The goal was to classify individuals into high, middle, and low personality categories for use in job selection or training.
Application of support vector machines for prediction of anti hiv activity of... - Alexander Decker
This document describes a study that used support vector machines (SVM) to develop a quantitative structure-activity relationship (QSAR) model to predict the anti-HIV activity of TIBO derivatives. The SVM model achieved high correlation (q2=0.96) and low error (RMSE=0.212), outperforming artificial neural networks and multiple linear regression models developed on the same data set. The results indicate that SVM is a valuable tool for QSAR modeling and predicting anti-HIV activity of chemical compounds.
ARTIFICIAL NEURAL NETWORK APPROACH TO MODELING OF POLYPROPYLENE REACTOR - ijac123
This paper presents the modeling of a highly nonlinear polymerization process using the artificial neural network approach for model-predictive purposes. Polymerization occurs in a fluidized-bed polypropylene reactor using a Ziegler-Natta catalyst, and the main objective was modeling of the reactor production rate. The data set used for identification of the model is real process data received from an existing polypropylene plant, and the identified model is a nonlinear autoregressive neural network with exogenous input. The performance of the trained network has been verified using the real process data, and the ability to predict the production rate is shown in the conclusion.
This paper proposes using the Levenberg-Marquardt algorithm for training a neural network to predict diabetes. The Levenberg-Marquardt algorithm is designed to minimize error functions and combines gradient descent and the Gauss-Newton method. The neural network is trained on a dataset from an Indian hospital to classify patients as diabetic or non-diabetic. Experimental results show that the Levenberg-Marquardt algorithm achieves the best validation performance of 0.00012359 with perfect correlation between outputs and targets, outperforming other backpropagation algorithms for this diabetes prediction task.
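To make the blend of gradient descent and the Gauss-Newton method concrete, here is a minimal NumPy sketch of Levenberg-Marquardt steps on a small least-squares fit; the toy model, data, and damping schedule are illustrative assumptions, not the setup used in the paper.

```python
import numpy as np

# Minimal Levenberg-Marquardt loop for least-squares fitting of a model
# y = f(x, w). The damping factor mu blends Gauss-Newton (mu -> 0) with
# gradient descent (mu large). The model and data are illustrative.

def model(x, w):
    return w[0] * np.exp(-w[1] * x)           # assumed toy model

def jacobian(x, w):
    # Partial derivatives of the residuals r = model(x, w) - y w.r.t. w.
    return np.stack([np.exp(-w[1] * x),
                     -w[0] * x * np.exp(-w[1] * x)], axis=1)

def lm_step(x, y, w, mu):
    r = model(x, w) - y                       # residuals
    J = jacobian(x, w)
    H = J.T @ J + mu * np.eye(len(w))         # damped Gauss-Newton Hessian
    return w - np.linalg.solve(H, J.T @ r)    # LM update

rng = np.random.default_rng(0)
x = np.linspace(0, 4, 50)
y = 2.5 * np.exp(-1.3 * x) + 0.01 * rng.standard_normal(50)

w, mu = np.array([1.0, 1.0]), 1e-2
for _ in range(20):
    w_new = lm_step(x, y, w, mu)
    # Accept the step and reduce damping if the error decreased,
    # otherwise increase damping (classic LM heuristic).
    if np.sum((model(x, w_new) - y) ** 2) < np.sum((model(x, w) - y) ** 2):
        w, mu = w_new, mu * 0.5
    else:
        mu *= 2.0
print(w)  # should approach [2.5, 1.3]
```

When mu is large the step approaches a small gradient-descent step, and when mu shrinks it approaches the Gauss-Newton step, which is exactly the trade-off the algorithm exploits.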
This document outlines the course details for Deep Learning for Data Science at SRM Institute of Science and Technology. The course is divided into 5 units that cover topics such as introduction to neural networks, artificial neural network architectures, neural network models like perceptrons and multilayer perceptrons, backpropagation algorithm, regularization techniques, convolutional neural networks, and reinforcement learning. The document provides an overview of the topics to be discussed each week for the different units.
Artificial neural networks seminar presentation using MSWord - Mohd Faiz
This document provides an overview of artificial neural networks. It discusses neural network architectures including feedforward and recurrent networks. It covers neural network learning methods such as supervised learning, unsupervised learning, and reinforcement learning. Backpropagation is described as a method for training neural networks by calculating partial derivatives of the error function. Higher order learning algorithms and considerations for designing neural networks like choosing the number of hidden layers and activation functions are also summarized.
PSO-based Training, Pruning, and Ensembling of Extreme Learning Machine RBF N... - ijceronline
The document discusses adaptive channel equalization using neural networks. It provides an overview of neural networks and their application to channel equalization. Specifically, it summarizes various neural network architectures that have been used for equalization, including multilayer perceptrons, functional link artificial neural networks, Chebyshev neural networks, and radial basis function networks. It compares the bit error rate performance of these different neural network equalizers with traditional linear equalizers such as LMS and RLS. Overall, the document finds that neural network equalizers can better handle nonlinear channel distortions compared to linear equalizers and that radial basis function networks provide particularly good performance for channel equalization applications.
Design of airfoil using backpropagation training with mixed approach - Editor Jacotech
The Levenberg-Marquardt backpropagation training method has some limitations associated with overfitting and local optimum problems. Here, we propose a new algorithm to increase the convergence speed of backpropagation learning for designing the airfoil. The aerodynamic force coefficients corresponding to a series of airfoils are stored in a database along with the airfoil coordinates. A feedforward neural network is created with the aerodynamic coefficients as input to produce the airfoil coordinates as output. In the proposed algorithm, for the output layer we use a cost function having linear and nonlinear error terms, while for the hidden layer we use a steepest-descent cost function. Results indicate that this mixed approach greatly enhances the training of the artificial neural network and can accurately predict the airfoil profile.
Design of airfoil using backpropagation training with mixed approach - Editor Jacotech
The document describes a new algorithm for designing airfoils using neural networks. The algorithm uses a mixed training approach: it trains the output layer of the neural network using a cost function with linear and nonlinear error terms for faster convergence, while training the hidden layer using steepest descent. Results show the mixed approach converges much faster than traditional backpropagation or Levenberg-Marquardt algorithms alone. The algorithm more accurately predicts airfoil profiles with fewer training iterations.
There are very few examples of the use of various recurrent neural network architectures to predict student learning outcomes. In fact, the only architecture used to solve this problem is the LSTM architecture. The works devoted to the use of LSTM to predict educational outcomes do not present a detailed theoretical substantiation of the preference for this particular RNN architecture. In this regard, it seems advisable to provide such justification within the framework of this study. The main property of the input data for prediction of educational outcomes is its temporal nature. A sequence of user actions unfolds in time and is evaluated (classified) by an external observer as evidence of the presence or absence of an educational result (objective or metaobjective). Accordingly, the RNN used to classify user actions should adjust the neuron weights over a certain set of states in the past. At the same time, the length of this sequence of states is not predetermined: it can be short (for example, for objective results) or quite long.
Handwritten Digit Recognition using Convolutional Neural Networks - IRJET Journal
This document discusses using a convolutional neural network called LeNet to perform handwritten digit recognition on the MNIST dataset. It begins with an abstract that outlines using LeNet, a type of convolutional network, to accurately classify handwritten digits from 0 to 9. It then provides background on convolutional networks and how they can extract and utilize features from images to classify patterns with translation and scaling invariance. The document implements LeNet using the Keras deep learning library in Python to classify images from the MNIST dataset, which contains labeled images of handwritten digits. It analyzes the architecture of LeNet and how convolutional and pooling layers are used to extract features that are passed to fully connected layers for classification.
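As a rough illustration of the approach described, here is a minimal LeNet-style model in Keras; the layer sizes follow the classic LeNet-5 layout, and since the document's exact configuration is not given, the activations, optimizer, and training settings are assumptions.

```python
import tensorflow as tf

# A LeNet-style convolutional network for MNIST, sketched with Keras.
# Layer sizes follow the classic LeNet-5 layout; the exact hyperparameters
# used in the document are not given, so these values are assumptions.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train[..., None] / 255.0      # (60000, 28, 28, 1), scaled to [0, 1]
x_test = x_test[..., None] / 255.0

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(6, 5, padding="same", activation="tanh",
                           input_shape=(28, 28, 1)),
    tf.keras.layers.AveragePooling2D(),   # downsample: feature maps 14x14
    tf.keras.layers.Conv2D(16, 5, activation="tanh"),
    tf.keras.layers.AveragePooling2D(),   # downsample: feature maps 5x5
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(120, activation="tanh"),
    tf.keras.layers.Dense(84, activation="tanh"),
    tf.keras.layers.Dense(10, activation="softmax"),  # digits 0-9
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=5, batch_size=128,
          validation_data=(x_test, y_test))
```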
Neural network based numerical digits recognization using nnt in matlab - ijcses
Artificial neural networks are models inspired by the human nervous system that are capable of learning. One of the important applications of artificial neural networks is character recognition, which finds application in a number of areas, such as banking, security products, hospitals, and robotics. This paper is based on a system that recognizes an English numeral, given by the user, which is already trained on the features of the numbers to be recognized using NNT (Neural Network Toolbox). The system has a neural network as its core, which is first trained on a database. The training of the neural network extracts the features of the English numbers and stores them in the database. The next phase of the system is to recognize the number given by the user: the features of the given number are extracted, compared with the feature database, and the recognized number is displayed.
This document discusses using the Levenberg-Marquardt algorithm for forecasting stock exchange share rates on the Karachi Stock Exchange. It provides an overview of artificial neural networks and how they can be used for financial forecasting applications. The Levenberg-Marquardt algorithm is presented as an efficient method for training neural networks to minimize errors through gradient descent. The document applies this method to train a neural network to predict the direction of change in share prices on the Karachi Stock Exchange. The network is trained on historical stock price data and testing shows it can achieve the performance goal of forecasting next day price changes.
Deep learning algorithms have drawn the attention of researchers working in the fields of computer vision, speech recognition, malware detection, pattern recognition, and natural language processing. In this paper, we present an overview of deep learning techniques such as the convolutional neural network, deep belief network, autoencoder, restricted Boltzmann machine, and recurrent neural network. In addition, current work on deep learning algorithms for malware detection is surveyed. Suggestions for future research are given with full justification, and an experimental analysis is presented to show the importance of deep learning techniques.
This document presents a method to estimate the weights of main materials (copper, iron, oil) for transformers using an artificial neural network (ANN) with the Levenberg-Marquardt backpropagation algorithm. Training data consisting of 24 input/output pairs obtained from a transformer manufacturing company are used to train the ANN, with inputs like short circuit impedance, installation height, and temperature and outputs of material weights. The trained ANN is then able to accurately estimate the material weights of transformers, providing a method to forecast transformer costs for manufacturers.
This document discusses the use of artificial neural networks (ANNs) for process control applications. It covers several key topics:
1) ANNs can model nonlinear systems through parallel processing and learning algorithms like backpropagation. Multi-layer neural networks are commonly used for pattern recognition and control.
2) Various ANN-based control configurations are described, including direct inverse control, direct adaptive control, and internal model control.
3) Learning algorithms like backpropagation and applications like system identification, modeling, fault detection, and temperature control are discussed.
4) The document concludes that multi-layer neural networks trained with backpropagation are well-suited for process identification and control, as they can handle nonlinearity.
This document discusses applications of wavelet transforms and neural networks in earthquake geotechnical engineering. It contains the following sections:
1. Definitions of wavelet transforms and neural networks
2. Applications of each technique, including using wavelets to remove noise from statistical data and neural networks for structural system identification.
3. The main applications discussed are using wavelet transforms to process earthquake ground motion data by removing noise, and using neural networks to model complex geotechnical systems and predict soil behavior under earthquake loading.
This article provides an introduction to artificial neural networks (ANNs) and presents guidelines for designing effective ANN solutions. It discusses the key components of ANNs, including their biological inspiration, history, and different types of learning algorithms. The article emphasizes that successful ANN development requires extensive domain knowledge engineering and following best practices for selecting input variables, learning methods, architecture, and training samples. Specifically, it recommends knowledge-based input selection, choosing appropriate learning algorithms based on the data type, designing network topology based on the algorithm, and selecting optimal training set sizes, especially for time series problems. Overall, the article stresses that incorporating domain expertise at each design step is essential for building ANNs that generalize well to new problems.
Live to learn: learning rules-based artificial neural network - nooriasukmaningtyas
An extensive review of the artificial neural network (ANN) is presented in this paper. Previous studies review the ANN based on the approaches (algorithms) used or based on the types of ANN. The presented paper reviews the ANN based on the goal of the ANN (methods and layers), which is the main objective of this paper. As a well-known artificial intelligence model, the ANN mimics the human nervous system in handling the information transmitted by the different nodes (also known as neurons) in the model. These nodes are stacked in layers and work collectively to bring about solutions to complex problems. Numerous structures exist for ANNs, and each of these structures is designed to address a specific task. Basically, the ANN architecture comprises three different layers: the first layer is the input layer, which consists of several input nodes that represent the input parameters of the model; the hidden layer is the second layer and consists of hidden neurons that are directly connected to the neurons in the output layer; and the third layer is the output layer, which is the model's response layer. The output layer neurons have the activation functions for the calculation of the final ANN output. The connection between the nodes in the ANN model is mediated by synaptic weights. This paper is a comprehensive study of ANN models and their layers.
Discover the latest insights on Data Driven Maintenance with our comprehensive webinar presentation. Learn about traditional maintenance challenges, the right approach to utilizing data, and the benefits of adopting a Data Driven Maintenance strategy. Explore real-world examples, industry best practices, and innovative solutions like FMECA and the D3M model. This presentation, led by expert Jules Oudmans, is essential for asset owners looking to optimize their maintenance processes and leverage digital technologies for improved efficiency and performance. Download now to stay ahead in the evolving maintenance landscape.
Null Bangalore | Pentesters Approach to AWS IAM - Divyanshu
#Abstract:
- Learn about real-world methods for auditing AWS IAM (Identity and Access Management) as a pentester, with a brief discussion of IAM as well as some typical misconfigurations and their potential exploits, in order to reinforce an understanding of IAM security best practices.
- Gain actionable insights into AWS IAM policies and roles using a hands-on approach; a minimal scripted sketch of the least-privilege scenario appears after the scenario list below.
#Prerequisites:
- Basic understanding of AWS services and architecture
- Familiarity with cloud security concepts
- Experience using the AWS Management Console or AWS CLI.
- For hands on lab create account on [killercoda.com](https://killercoda.com/cloudsecurity-scenario/)
# Scenario Covered:
- Basics of IAM in AWS
- Implementing IAM Policies with Least Privilege to Manage S3 Bucket
- Objective: Create an S3 bucket with least privilege IAM policy and validate access.
- Steps:
- Create S3 bucket.
- Attach least privilege policy to IAM user.
- Validate access.
- Exploiting IAM PassRole Misconfiguration
- Allows a user to pass a specific IAM role to an AWS service (EC2), typically used for service access delegation. Then exploit the PassRole misconfiguration, granting unauthorized access to sensitive resources.
- Objective: Demonstrate how a PassRole misconfiguration can grant unauthorized access.
- Steps:
- Allow user to pass IAM role to EC2.
- Exploit misconfiguration for unauthorized access.
- Access sensitive resources.
- Exploiting IAM AssumeRole Misconfiguration with Overly Permissive Role
- An overly permissive IAM role configuration can lead to privilege escalation by creating a role with administrative privileges and allowing a user to assume it.
- Objective: Show how overly permissive IAM roles can lead to privilege escalation.
- Steps:
- Create role with administrative privileges.
- Allow user to assume the role.
- Perform administrative actions.
- Differentiation between PassRole vs AssumeRole
Try at [killercoda.com](https://killercoda.com/cloudsecurity-scenario/)
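To make the least-privilege scenario concrete, here is a hedged boto3 sketch; the bucket, user, and policy names are placeholders, the policy JSON is one reasonable least-privilege shape rather than the talk's exact lab material, and it should only be run against a test account.

```python
import json
import boto3

# Sketch of the least-privilege S3 scenario with boto3. Bucket, user, and
# policy names are placeholders; run against a test account only.
iam = boto3.client("iam")
s3 = boto3.client("s3")

BUCKET = "example-least-priv-bucket"   # placeholder name
USER = "example-pentest-user"          # placeholder name (must already exist)

s3.create_bucket(Bucket=BUCKET)

# Least-privilege policy: object read/write on this bucket only,
# no ListAllMyBuckets, no IAM actions.
policy_doc = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["s3:GetObject", "s3:PutObject"],
        "Resource": f"arn:aws:s3:::{BUCKET}/*",
    }],
}
resp = iam.create_policy(
    PolicyName="ExampleLeastPrivS3Policy",
    PolicyDocument=json.dumps(policy_doc),
)
iam.attach_user_policy(UserName=USER, PolicyArn=resp["Policy"]["Arn"])
# Validation: with this user's credentials, GetObject/PutObject on the
# bucket should succeed while any other API call is denied.
```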
artificial intelligence and data science contents.pptx - GauravCar
What is artificial intelligence? Artificial intelligence is the ability of a computer or computer-controlled robot to perform tasks that are commonly associated with the intellectual processes characteristic of humans, such as the ability to reason.
Introduction: e-waste definition, sources of e-waste, hazardous substances in e-waste, effects of e-waste on environment and human health, need for e-waste management, e-waste handling rules, waste minimization techniques for managing e-waste, recycling of e-waste, disposal and treatment methods of e-waste, mechanism of extraction of precious metal from leaching solution, global scenario of e-waste, e-waste in India, case studies.
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024 - Sinan KOZAK
Sinan from the Delivery Hero mobile infrastructure engineering team shares a deep dive into performance acceleration with Gradle build cache optimizations. Sinan shares their journey into solving complex build-cache problems that affect Gradle builds. By understanding the challenges and solutions found in our journey, we aim to demonstrate the possibilities for faster builds. The case study reveals how overlapping outputs and cache misconfigurations led to significant increases in build times, especially as the project scaled up with numerous modules using Paparazzi tests. The journey from diagnosing to defeating cache issues offers invaluable lessons on maintaining cache integrity without sacrificing functionality.
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw... - IJECEIAES
Medical image analysis has witnessed significant advancements with deep learning techniques. In the domain of brain tumor segmentation, the ability to precisely delineate tumor boundaries from magnetic resonance imaging (MRI) scans holds profound implications for diagnosis. This study presents an ensemble convolutional neural network (CNN) with transfer learning, integrating the state-of-the-art Deeplabv3+ architecture with the ResNet18 backbone. The model is rigorously trained and evaluated, exhibiting remarkable performance metrics, including an impressive global accuracy of 99.286%, a class accuracy of 82.191%, a mean intersection over union (IoU) of 79.900%, a weighted IoU of 98.620%, and a boundary F1 (BF) score of 83.303%. Notably, a detailed comparative analysis with existing methods showcases the superiority of the proposed model. These findings underscore the model's competence in precise brain tumor localization and its potential to revolutionize medical image analysis and enhance healthcare outcomes. This research paves the way for future exploration and optimization of advanced CNN models in medical imaging, with emphasis on addressing false positives and resource efficiency.
Comparative analysis between traditional aquaponics and reconstructed aquapon... - bijceesjournal
The aquaponic system of planting is a method that does not require soil. It needs only water, fish, lava rocks (a substitute for soil), and plants. Aquaponic systems are sustainable and environmentally friendly. Their use not only helps with planting in small spaces but also helps reduce artificial chemical use and minimizes excess water use, as aquaponics consumes 90% less water than soil-based gardening. The study applied a descriptive and experimental design to assess and compare conventional and reconstructed aquaponic methods for growing tomatoes. The researchers created an observation checklist to determine the significant factors of the study. The study aims to determine the significant difference between traditional and reconstructed aquaponics systems propagating tomatoes in terms of height, weight, girth, and number of fruits. The reconstructed aquaponics system's higher growth yield results in a much better nourished crop than the traditional aquaponics system: it is superior in number of fruits, height, weight, and girth. Moreover, the reconstructed aquaponics system is shown to eliminate the hindrances present in the traditional aquaponics system, namely overcrowding of fish, algae growth, pest problems, contaminated water, and dead fish.
boundedness of the modified Levenberg–Marquardt algorithm can be assured based on the Lyapunov technique; therefore, the artificial neural network outputs and weights of the modified Levenberg–Marquardt algorithm remain bounded during all the training and testing.

In [25]–[27], there is an interesting procedure to compute the Levenberg–Marquardt and Newton algorithms for an artificial neural network with multiple hidden layers, which is useful in deep learning. In contrast to the aforementioned work, this article computes the modified Levenberg–Marquardt algorithm for an artificial neural network with a single hidden layer for the following four reasons: 1) we show that the two-hidden-layer Levenberg–Marquardt and Newton algorithms are worse than the single-hidden-layer Levenberg–Marquardt and Newton algorithms because the latter present one singularity point, while the two-hidden-layer versions present three singularity points; 2) there is a computational concern that computing the inverse required by the Levenberg–Marquardt and Newton algorithms for an artificial neural network with multiple hidden layers would be very expensive; 3) in [28]–[30], it is shown, based on the Stone–Weierstrass theorem, that the targets can be arbitrarily well approximated by an artificial neural network with a single hidden layer and a hyperbolic tangent function; and 4) this article is mainly focused on assuring the stability of the modified Levenberg–Marquardt algorithm for an artificial neural network with a single hidden layer.

Finally, we compare artificial neural network learning with the modified Levenberg–Marquardt algorithm, the Levenberg–Marquardt algorithm [8]–[11], the Newton algorithm [1], [2], the stable gradient algorithm in a neural network [31], [32], and the stable gradient algorithm in a radial basis function neural network [33], [34] on the learning of the electric and brain signal data sets. The electric signal data set is obtained from electricity load and price forecasting with MATLAB, where the details are explained in [35]. The brain signal data set is obtained from our laboratory, where the details are explained in [36].

The remainder of this article is organized as follows. Section II presents the Levenberg–Marquardt and Newton algorithms for artificial neural network learning. Section III discusses the two-hidden-layer Levenberg–Marquardt and Newton algorithms for the two-hidden-layer artificial neural network learning. Section IV introduces the modified Levenberg–Marquardt algorithm for artificial neural network learning, and the error stability and weight boundedness are assured. Section V shows the comparison results of several algorithms for the learning of the electric and brain signal data sets. In Section VI, conclusions and forthcoming work are detailed.
II. LEVENBERG–MARQUARDT AND NEWTON ALGORITHMS FOR THE ARTIFICIAL NEURAL NETWORK LEARNING
The algorithms for the artificial neural network learning frequently evaluate the first derivative of the cost function with respect to the weights. Nevertheless, there are several cases where it is interesting to evaluate the second derivatives of the cost function with respect to the weights. The second-order partial derivatives of the cost function with respect to the weights are known as the Hessian.

[Fig. 1. Artificial neural network.]
A. Hessian for the Artificial Neural Network Learning
In this article, we use a special artificial neural network with one hidden layer. It could be extended to a general multilayer artificial neural network; nevertheless, this research is focused on a compact artificial neural network. This artificial neural network uses hyperbolic tangent functions in the hidden layer and linear functions in the output layer. We define the artificial neural network as

$$d_{l,k} = \sum_{j} q_{lj,k}\, g\!\left(\sum_{i} p_{ji,k}\, a_{i,k}\right) \tag{1}$$

where $p_{ji,k}$ are the weights of the hidden layer, $q_{lj,k}$ are the weights of the output layer, $g(\cdot)$ are the activation functions, $a_{i,k}$ are the artificial neural network inputs, $d_{l,k}$ are the artificial neural network outputs, $i$ indexes the input layer, $j$ the hidden layer, $l$ the output layer, and $k$ is the iteration.

We consider the artificial neural network of Fig. 1. We define $p_{ji,k}$ as the weights of the hidden layer and $q_{lj,k}$ as the weights of the output layer.
We define the cost function $E_k$ as

$$E_k = \frac{1}{2} \sum_{l=1}^{L_T} \left(d_{l,k} - t_{l,k}\right)^2 \tag{2}$$

where $d_{l,k}$ are the artificial neural network outputs, $t_{l,k}$ are the data set targets, and $L_T$ is the total number of outputs. The second-order partial derivatives of the cost function $E_k$ with respect to the weights $p_{ji,k}$ and $q_{lj,k}$ will be used to obtain the Newton and Levenberg–Marquardt algorithms.
We consider the forward propagation as

$$z_{j,k} = \sum_{i} p_{ji,k}\, a_{i,k}, \qquad c_{j,k} = g\!\left(z_{j,k}\right)$$

$$x_{l,k} = \sum_{j} q_{lj,k}\, c_{j,k}, \qquad d_{l,k} = f\!\left(x_{l,k}\right) = x_{l,k} \tag{3}$$
where $a_{i,k}$ are the artificial neural network inputs, $d_{l,k}$ are the artificial neural network outputs, $p_{ji,k}$ are the hidden layer weights, and $q_{lj,k}$ are the output layer weights.

We consider the activation functions in the hidden layer to be the hyperbolic tangent functions

$$g\!\left(z_{j,k}\right) = \frac{e^{z_{j,k}} - e^{-z_{j,k}}}{e^{z_{j,k}} + e^{-z_{j,k}}} = \tanh\!\left(z_{j,k}\right). \tag{4}$$
The first and second derivatives of the hyperbolic tangent functions (4) are

$$g'\!\left(z_{j,k}\right) = \frac{4}{\left(e^{z_{j,k}} + e^{-z_{j,k}}\right)^{2}} = \operatorname{sech}^{2}\!\left(z_{j,k}\right)$$

$$g''\!\left(z_{j,k}\right) = -2\,\frac{e^{z_{j,k}} - e^{-z_{j,k}}}{e^{z_{j,k}} + e^{-z_{j,k}}}\cdot\frac{4}{\left(e^{z_{j,k}} + e^{-z_{j,k}}\right)^{2}} = -2 \tanh\!\left(z_{j,k}\right)\operatorname{sech}^{2}\!\left(z_{j,k}\right) = -2\, g\!\left(z_{j,k}\right) g'\!\left(z_{j,k}\right). \tag{5}$$
We consider the activation functions of the output layer to be the linear functions

$$f\!\left(x_{l,k}\right) = x_{l,k}. \tag{6}$$

The first and second derivatives of the linear functions (6) are

$$f'\!\left(x_{l,k}\right) = 1, \qquad f''\!\left(x_{l,k}\right) = 0. \tag{7}$$
The first and second derivatives of the cost function (2) are

$$\frac{\partial E_k}{\partial d_{l,k}} = d_{l,k} - t_{l,k}, \qquad \frac{\partial^{2} E_k}{\partial d_{l,k}^{2}} = 1. \tag{8}$$
Using the cost function (2), we obtain the backpropagation of the output layer as

$$\frac{\partial E_k}{\partial q_{lj,k}} = \frac{\partial E_k}{\partial d_{l,k}}\,\frac{\partial d_{l,k}}{\partial x_{l,k}}\,\frac{\partial x_{l,k}}{\partial q_{lj,k}} = \left(d_{l,k} - t_{l,k}\right)\frac{\partial f\!\left(x_{l,k}\right)}{\partial x_{l,k}}\, c_{j,k} = \left(d_{l,k} - t_{l,k}\right)(1)\, g\!\left(z_{j,k}\right) = \left(d_{l,k} - t_{l,k}\right) g\!\left(z_{j,k}\right) \tag{9}$$

where $f(x_{l,k}) = x_{l,k}$ of (6) and $g(z_{j,k}) = \tanh(z_{j,k})$ of (4).
Using the cost function (2), we obtain the backpropagation
of the hidden layer as
\frac{\partial E_k}{\partial p_{ji,k}} = \frac{\partial E_k}{\partial d_{l,k}} \frac{\partial d_{l,k}}{\partial x_{l,k}} \frac{\partial x_{l,k}}{\partial c_{j,k}} \frac{\partial c_{j,k}}{\partial z_{j,k}} \frac{\partial z_{j,k}}{\partial p_{ji,k}} = \big( d_{l,k} - t_{l,k} \big) (1) \, q_{lj,k} g'\big(z_{j,k}\big) a_{i,k} = \big( d_{l,k} - t_{l,k} \big) q_{lj,k} g'\big(z_{j,k}\big) a_{i,k}    (10)

where g'(z_{j,k}) = (\partial c_{j,k}/\partial z_{j,k}) = (\partial g(z_{j,k})/\partial z_{j,k}) = \mathrm{sech}^2(z_{j,k}) of (5).
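Continuing the sketch after (3), the backpropagation of (9) and (10) can be written for the multi-output case as follows; again, the names are our own assumptions.

import numpy as np

def gradients(a, P, Q, t):
    # forward pass of (3)
    z = P @ a
    c = np.tanh(z)                       # c_{j,k} = g(z_{j,k})
    d = Q @ c                            # linear output layer, d = x
    e = d - t                            # errors (d_{l,k} - t_{l,k})
    g1 = 1.0 - c ** 2                    # g'(z_{j,k}) = sech^2(z_{j,k})
    dE_dQ = np.outer(e, c)               # (9): (d_{l,k} - t_{l,k}) g(z_{j,k})
    dE_dP = np.outer((Q.T @ e) * g1, a)  # (10), summed over l as in E_k of (2)
    return dE_dP, dE_dQ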
We define the second derivative of Ek as the Hessian Hk
[25]–[27]
H_k = \nabla\nabla E_k = \begin{bmatrix} \dfrac{\partial^2 E_k}{\partial p_{ji,k}^2} & \dfrac{\partial^2 E_k}{\partial p_{ji,k} \partial q_{lj,k}} \\ \dfrac{\partial^2 E_k}{\partial p_{ji,k} \partial q_{lj,k}} & \dfrac{\partial^2 E_k}{\partial q_{lj,k}^2} \end{bmatrix}    (11)

where the Hessian is symmetrical

\frac{\partial^2 E_k}{\partial p_{ji,k} \partial q_{lj,k}} = \frac{\partial^2 E_k}{\partial q_{lj,k} \partial p_{ji,k}}.    (12)
The Hessian elements are
\frac{\partial^2 E_k}{\partial p_{ji,k}^2} = a_{i,k}^2 q_{lj,k} \Big[ g''\big(z_{j,k}\big) \sigma_{i,k} + \big(g'\big(z_{j,k}\big)\big)^2 q_{lj,k} S_{i,k} \Big]
\frac{\partial^2 E_k}{\partial p_{ji,k} \partial q_{lj,k}} = a_{i,k} g'\big(z_{j,k}\big) \big[ \sigma_{i,k} + c_{j,k} q_{lj,k} S_{i,k} \big]
\frac{\partial^2 E_k}{\partial q_{lj,k}^2} = c_{j,k}^2 f''\big(x_{l,k}\big) \sigma_{i,k} + \big(f'\big(x_{l,k}\big)\big)^2 S_{i,k}    (13)

where

S_{i,k} = \frac{\partial^2 E_k}{\partial d_{l,k}^2} = 1
g'\big(z_{j,k}\big) = \mathrm{sech}^2\big(z_{j,k}\big), \quad f'\big(x_{l,k}\big) = 1
g''\big(z_{j,k}\big) = -2 \tanh\big(z_{j,k}\big) \mathrm{sech}^2\big(z_{j,k}\big), \quad f''\big(x_{l,k}\big) = 0
c_{j,k} = \frac{\partial x_{l,k}}{\partial q_{lj,k}} = g\big(z_{j,k}\big), \quad a_{i,k} = \frac{\partial z_{j,k}}{\partial p_{ji,k}}
g\big(z_{j,k}\big) = \tanh\big(z_{j,k}\big), \quad f\big(x_{l,k}\big) = x_{l,k}, \quad \sigma_{i,k} = \big( d_{l,k} - t_{l,k} \big).
We substitute the elements of (13) into (11); then, the Hessian is

H_k = \nabla\nabla E_k = \begin{bmatrix} \dfrac{\partial^2 E_k}{\partial p_{ji,k}^2} & \dfrac{\partial^2 E_k}{\partial p_{ji,k} \partial q_{lj,k}} \\ \dfrac{\partial^2 E_k}{\partial p_{ji,k} \partial q_{lj,k}} & \dfrac{\partial^2 E_k}{\partial q_{lj,k}^2} \end{bmatrix}
\frac{\partial^2 E_k}{\partial p_{ji,k}^2} = a_{i,k}^2 q_{lj,k} \Big[ -2 g\big(z_{j,k}\big) g'\big(z_{j,k}\big) \big( d_{l,k} - t_{l,k} \big) + \big(g'\big(z_{j,k}\big)\big)^2 q_{lj,k} \Big]
\frac{\partial^2 E_k}{\partial p_{ji,k} \partial q_{lj,k}} = a_{i,k} g'\big(z_{j,k}\big) \big[ \big( d_{l,k} - t_{l,k} \big) + g\big(z_{j,k}\big) q_{lj,k} \big]
\frac{\partial^2 E_k}{\partial q_{lj,k}^2} = \big( g\big(z_{j,k}\big) \big)^2    (14)
where a_{i,k} are the artificial neural network inputs, d_{l,k} are the artificial neural network outputs, g(z_{j,k}) = \tanh(z_{j,k}) are the activation functions, g'(z_{j,k}) = \mathrm{sech}^2(z_{j,k}) are the derivatives of the activation functions, t_{l,k} are the data set targets, z_{j,k} = \sum_i p_{ji,k} a_{i,k} are the hidden layer outputs, and q_{lj,k} are the weights of the output layer.
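For a single output (L_T = 1), the Hessian entries of (14) and the sums \beta_{C,k}, \beta_{D,k}, and \beta_{E,k} that appear in (15) below can be sketched as follows; this is our own illustration, with assumed names and shapes.

import numpy as np

def hessian_sums(a, P, Q, t):
    # single-output case: Q has shape (1, J), t is a scalar target
    z = P @ a
    g = np.tanh(z)                     # g(z_{j,k})
    g1 = 1.0 - g ** 2                  # g'(z_{j,k})
    e = float(Q[0] @ g - t)            # d_{l,k} - t_{l,k}
    q = Q[0]                           # q_{lj,k} for l = 1
    # Hessian elements of (14)
    d2E_dP2 = np.outer(q * (-2.0 * g * g1 * e) + (g1 ** 2) * q ** 2, a ** 2)
    d2E_dPdQ = np.outer(g1 * (e + g * q), a)
    d2E_dQ2 = g ** 2
    beta_C = d2E_dP2.sum()             # beta_{C,k}: sum over j, i (and l)
    beta_D = d2E_dQ2.sum()             # beta_{D,k}: sum over j
    beta_E = d2E_dPdQ.sum()            # beta_{E,k}: sum over j, i (and l)
    return beta_C, beta_D, beta_E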
In the next step, we evaluate the Hessian with the
Levenberg–Marquardt and Newton algorithms.
B. Newton Algorithm
The Newton algorithm constitutes the first alternative to
update the weights for the artificial neural network learning.
We represent the updating of the Newton algorithm as [1], [2]
\begin{bmatrix} p_{ji,k+1} \\ q_{lj,k+1} \end{bmatrix} = \begin{bmatrix} p_{ji,k} \\ q_{lj,k} \end{bmatrix} - \alpha [H_k]^{-1} \begin{bmatrix} \dfrac{\partial E_k}{\partial p_{ji,k}} \\ \dfrac{\partial E_k}{\partial q_{lj,k}} \end{bmatrix}
H_k = \begin{bmatrix} \beta_{C,k} & \beta_{E,k} \\ \beta_{E,k} & \beta_{D,k} \end{bmatrix}
\beta_{C,k} = \sum_{jil} \frac{\partial^2 E_k}{\partial p_{ji,k}^2}, \quad \beta_{D,k} = \sum_j \frac{\partial^2 E_k}{\partial q_{lj,k}^2}
\beta_{E,k} = \sum_{jil} \frac{\partial^2 E_k}{\partial p_{ji,k} \partial q_{lj,k}}, \quad \sum_{jil} = \sum_j \sum_i \sum_l    (15)
where the elements (\partial^2 E_k/\partial p_{ji,k}^2), (\partial^2 E_k/\partial q_{lj,k}^2), and (\partial^2 E_k/\partial p_{ji,k}\partial q_{lj,k}) are in (14), the elements (\partial E_k/\partial q_{lj,k}) and (\partial E_k/\partial p_{ji,k}) are in (9) and (10), p_{ji,k} and q_{lj,k} are the weights, and \alpha is the learning factor. The Newton algorithm requires the existence of the inverse of the Hessian, [H_k]^{-1}.
Now, we will represent the Newton algorithm of (15) in the
scalar form. First, from (15), we obtain the inverse of Hk as
[H_k]^{-1} = \begin{bmatrix} \beta_{C,k} & \beta_{E,k} \\ \beta_{E,k} & \beta_{D,k} \end{bmatrix}^{-1} = \frac{1}{\det[H_k]} \begin{bmatrix} \beta_{D,k} & -\beta_{E,k} \\ -\beta_{E,k} & \beta_{C,k} \end{bmatrix}
\det[H_k] = \big(\beta_{C,k}\big)\big(\beta_{D,k}\big) - \big(\beta_{E,k}\big)^2
\beta_{C,k} = \sum_{jil} \frac{\partial^2 E_k}{\partial p_{ji,k}^2}, \quad \beta_{D,k} = \sum_j \frac{\partial^2 E_k}{\partial q_{lj,k}^2}
\beta_{E,k} = \sum_{jil} \frac{\partial^2 E_k}{\partial p_{ji,k} \partial q_{lj,k}}, \quad \sum_{jil} = \sum_j \sum_i \sum_l.    (16)
We substitute [H_k]^{-1} of (16) into (15) as

\begin{bmatrix} p_{ji,k+1} \\ q_{lj,k+1} \end{bmatrix} = \begin{bmatrix} p_{ji,k} \\ q_{lj,k} \end{bmatrix} - \frac{\alpha}{\det[H_k]} \begin{bmatrix} \beta_{D,k} & -\beta_{E,k} \\ -\beta_{E,k} & \beta_{C,k} \end{bmatrix} \begin{bmatrix} \dfrac{\partial E_k}{\partial p_{ji,k}} \\ \dfrac{\partial E_k}{\partial q_{lj,k}} \end{bmatrix}
\det[H_k] = \big(\beta_{C,k}\big)\big(\beta_{D,k}\big) - \big(\beta_{E,k}\big)^2.    (17)
Rewriting (17) in the scalar form gives

p_{ji,k+1} = p_{ji,k} - \beta_{Nji,k} \frac{\partial E_k}{\partial p_{ji,k}} + \gamma_{N,k} \frac{\partial E_k}{\partial q_{lj,k}}
q_{lj,k+1} = q_{lj,k} - \beta_{Nlj,k} \frac{\partial E_k}{\partial q_{lj,k}} + \gamma_{N,k} \frac{\partial E_k}{\partial p_{ji,k}}
\beta_{Nji,k} = \alpha \frac{\beta_{D,k}}{\det[H_k]}, \quad \beta_{Nlj,k} = \alpha \frac{\beta_{C,k}}{\det[H_k]}, \quad \gamma_{N,k} = \alpha \frac{\beta_{E,k}}{\det[H_k]}
\det[H_k]_N = \det[H_k] = \big(\beta_{C,k}\big)\big(\beta_{D,k}\big) - \big(\beta_{E,k}\big)^2
\beta_{C,k} = \sum_{jil} \frac{\partial^2 E_k}{\partial p_{ji,k}^2}, \quad \beta_{D,k} = \sum_j \frac{\partial^2 E_k}{\partial q_{lj,k}^2}
\beta_{E,k} = \sum_{jil} \frac{\partial^2 E_k}{\partial p_{ji,k} \partial q_{lj,k}}, \quad \sum_{jil} = \sum_j \sum_i \sum_l    (18)
where
\frac{\partial E_k}{\partial p_{ji,k}} = \big( d_{l,k} - t_{l,k} \big) q_{lj,k} g'\big(z_{j,k}\big) a_{i,k}
\frac{\partial E_k}{\partial q_{lj,k}} = g\big(z_{j,k}\big) \big( d_{l,k} - t_{l,k} \big)
\frac{\partial^2 E_k}{\partial p_{ji,k}^2} = a_{i,k}^2 q_{lj,k} \Big[ -2 g\big(z_{j,k}\big) g'\big(z_{j,k}\big) \big( d_{l,k} - t_{l,k} \big) + \big(g'\big(z_{j,k}\big)\big)^2 q_{lj,k} \Big]
\frac{\partial^2 E_k}{\partial q_{lj,k}^2} = \big( g\big(z_{j,k}\big) \big)^2
\frac{\partial^2 E_k}{\partial p_{ji,k} \partial q_{lj,k}} = a_{i,k} g'\big(z_{j,k}\big) \big[ \big( d_{l,k} - t_{l,k} \big) + g\big(z_{j,k}\big) q_{lj,k} \big].    (19)
\beta_{Nji,k}, \beta_{Nlj,k}, and \gamma_{N,k} are the learning rates, p_{ji,k} and q_{lj,k} are the weights, \alpha is the learning factor, g(z_{j,k}) = \tanh(z_{j,k}) are the activation functions, and g'(z_{j,k}) = \mathrm{sech}^2(z_{j,k}) are the derivatives of the activation functions. Equations (18) and (19) describe the Newton algorithm.
Remark 1: In the Newton algorithm of (18) and (19), we can observe that a value of zero in (\beta_{C,k})(\beta_{D,k}) - (\beta_{E,k})^2 of \det[H_k]_N is a singularity point of the learning rates \beta_{Nji,k}, \beta_{Nlj,k}, and \gamma_{N,k}. Consequently, the Newton algorithm error is not assured to be stable. Hence, it is worth considering an alternative algorithm for the artificial neural network learning.
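A sketch of one Newton step (18) follows; the explicit guard against the singularity of Remark 1, and its threshold eps, are our own additions.

def newton_step(p, q, dE_dp, dE_dq, beta_C, beta_D, beta_E, alpha=0.9, eps=1e-12):
    det = beta_C * beta_D - beta_E ** 2        # det[H_k]_N of (18)
    if abs(det) < eps:                         # singularity point of Remark 1
        raise ZeroDivisionError("det[H_k] is near zero; Newton step undefined")
    bN_p = alpha * beta_D / det                # beta_{Nji,k}
    bN_q = alpha * beta_C / det                # beta_{Nlj,k}
    gN = alpha * beta_E / det                  # gamma_{N,k}
    p_new = p - bN_p * dE_dp + gN * dE_dq      # first line of (18)
    q_new = q - bN_q * dE_dq + gN * dE_dp      # second line of (18)
    return p_new, q_new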
C. Levenberg–Marquardt Algorithm
The Levenberg–Marquardt algorithm constitutes the second
alternative to update the weights for the artificial neural
network learning. We represent the basic updating of the
Levenberg–Marquardt algorithm as [8]–[11]
\begin{bmatrix} p_{ji,k+1} \\ q_{lj,k+1} \end{bmatrix} = \begin{bmatrix} p_{ji,k} \\ q_{lj,k} \end{bmatrix} - [H_k + \alpha I]^{-1} \begin{bmatrix} \dfrac{\partial E_k}{\partial p_{ji,k}} \\ \dfrac{\partial E_k}{\partial q_{lj,k}} \end{bmatrix}
H_k = \begin{bmatrix} \beta_{C,k} & \beta_{E,k} \\ \beta_{E,k} & \beta_{D,k} \end{bmatrix}
\beta_{C,k} = \sum_{jil} \frac{\partial^2 E_k}{\partial p_{ji,k}^2}, \quad \beta_{D,k} = \sum_j \frac{\partial^2 E_k}{\partial q_{lj,k}^2}
\beta_{E,k} = \sum_{jil} \frac{\partial^2 E_k}{\partial p_{ji,k} \partial q_{lj,k}}, \quad \sum_{jil} = \sum_j \sum_i \sum_l    (20)
where the elements (\partial^2 E_k/\partial p_{ji,k}^2), (\partial^2 E_k/\partial q_{lj,k}^2), and (\partial^2 E_k/\partial p_{ji,k}\partial q_{lj,k}) are in (14), the elements (\partial E_k/\partial q_{lj,k}) and (\partial E_k/\partial p_{ji,k}) are in (9) and (10), p_{ji,k} and q_{lj,k} are the weights, and \alpha is the learning factor. The Levenberg–Marquardt algorithm requires the existence of the inverse [H_k + \alpha I]^{-1}.
Now, we will represent the Levenberg–Marquardt algorithm
of (20) in the scalar form. First, from (20), we obtain the
inverse of Hk + αI as
[H_k + \alpha I]^{-1} = \begin{bmatrix} \alpha + \beta_{C,k} & \beta_{E,k} \\ \beta_{E,k} & \alpha + \beta_{D,k} \end{bmatrix}^{-1} = \frac{1}{\det[H_k + \alpha I]} \begin{bmatrix} \alpha + \beta_{D,k} & -\beta_{E,k} \\ -\beta_{E,k} & \alpha + \beta_{C,k} \end{bmatrix}
\det[H_k + \alpha I] = \big(\alpha + \beta_{C,k}\big)\big(\alpha + \beta_{D,k}\big) - \big(\beta_{E,k}\big)^2
\beta_{C,k} = \sum_{jil} \frac{\partial^2 E_k}{\partial p_{ji,k}^2}, \quad \beta_{D,k} = \sum_j \frac{\partial^2 E_k}{\partial q_{lj,k}^2}
\beta_{E,k} = \sum_{jil} \frac{\partial^2 E_k}{\partial p_{ji,k} \partial q_{lj,k}}, \quad \sum_{jil} = \sum_j \sum_i \sum_l.    (21)
We substitute [H_k + \alpha I]^{-1} into (20) as

\begin{bmatrix} p_{ji,k+1} \\ q_{lj,k+1} \end{bmatrix} = \begin{bmatrix} p_{ji,k} \\ q_{lj,k} \end{bmatrix} - \frac{1}{\det[H_k + \alpha I]} \begin{bmatrix} \alpha + \beta_{D,k} & -\beta_{E,k} \\ -\beta_{E,k} & \alpha + \beta_{C,k} \end{bmatrix} \begin{bmatrix} \dfrac{\partial E_k}{\partial p_{ji,k}} \\ \dfrac{\partial E_k}{\partial q_{lj,k}} \end{bmatrix}
\det[H_k + \alpha I] = \big(\alpha + \beta_{C,k}\big)\big(\alpha + \beta_{D,k}\big) - \big(\beta_{E,k}\big)^2.    (22)
Rewriting (22) in the scalar form gives

p_{ji,k+1} = p_{ji,k} - \beta_{LMji,k} \frac{\partial E_k}{\partial p_{ji,k}} + \gamma_{LM,k} \frac{\partial E_k}{\partial q_{lj,k}}
q_{lj,k+1} = q_{lj,k} - \beta_{LMlj,k} \frac{\partial E_k}{\partial q_{lj,k}} + \gamma_{LM,k} \frac{\partial E_k}{\partial p_{ji,k}}
\beta_{LMji,k} = \frac{\alpha + \beta_{D,k}}{\det[H_k + \alpha I]}, \quad \beta_{LMlj,k} = \frac{\alpha + \beta_{C,k}}{\det[H_k + \alpha I]}, \quad \gamma_{LM,k} = \frac{\beta_{E,k}}{\det[H_k + \alpha I]}
\det[H_k]_{LM} = \det[H_k + \alpha I] = \big(\alpha + \beta_{C,k}\big)\big(\alpha + \beta_{D,k}\big) - \big(\beta_{E,k}\big)^2
\beta_{C,k} = \sum_{jil} \frac{\partial^2 E_k}{\partial p_{ji,k}^2}, \quad \beta_{D,k} = \sum_j \frac{\partial^2 E_k}{\partial q_{lj,k}^2}
\beta_{E,k} = \sum_{jil} \frac{\partial^2 E_k}{\partial p_{ji,k} \partial q_{lj,k}}, \quad \sum_{jil} = \sum_j \sum_i \sum_l    (23)
where
\frac{\partial E_k}{\partial p_{ji,k}} = \big( d_{l,k} - t_{l,k} \big) q_{lj,k} g'\big(z_{j,k}\big) a_{i,k}
\frac{\partial E_k}{\partial q_{lj,k}} = g\big(z_{j,k}\big) \big( d_{l,k} - t_{l,k} \big)
\frac{\partial^2 E_k}{\partial p_{ji,k}^2} = a_{i,k}^2 q_{lj,k} \Big[ -2 g\big(z_{j,k}\big) g'\big(z_{j,k}\big) \big( d_{l,k} - t_{l,k} \big) + \big(g'\big(z_{j,k}\big)\big)^2 q_{lj,k} \Big]
\frac{\partial^2 E_k}{\partial q_{lj,k}^2} = \big( g\big(z_{j,k}\big) \big)^2
\frac{\partial^2 E_k}{\partial p_{ji,k} \partial q_{lj,k}} = a_{i,k} g'\big(z_{j,k}\big) \big[ \big( d_{l,k} - t_{l,k} \big) + g\big(z_{j,k}\big) q_{lj,k} \big].    (24)
\beta_{LMji,k}, \beta_{LMlj,k}, and \gamma_{LM,k} are the learning rates, p_{ji,k} and q_{lj,k} are the weights, \alpha is the learning factor, g(z_{j,k}) = \tanh(z_{j,k}) are the activation functions, and g'(z_{j,k}) = \mathrm{sech}^2(z_{j,k}) are the derivatives of the activation functions. Equations (23) and (24) describe the Levenberg–Marquardt algorithm.
Remark 2: In the Levenberg–Marquardt algorithm of (23) and (24), we can observe that a value of zero in (\alpha + \beta_{C,k})(\alpha + \beta_{D,k}) - (\beta_{E,k})^2 of \det[H_k]_{LM} is a singularity point of the learning rates \beta_{LMji,k}, \beta_{LMlj,k}, and \gamma_{LM,k}. Consequently, the Levenberg–Marquardt algorithm error is not assured to be stable. Hence, it is of interest to modify the Levenberg–Marquardt algorithm to make its error stable.
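For comparison, one Levenberg–Marquardt step (23) can be sketched in the same style; the guard is again our own addition, since Remark 2 shows that \det[H_k]_{LM} can still vanish.

def lm_step(p, q, dE_dp, dE_dq, beta_C, beta_D, beta_E, alpha=0.9, eps=1e-12):
    det = (alpha + beta_C) * (alpha + beta_D) - beta_E ** 2  # det[H_k]_LM of (23)
    if abs(det) < eps:                                       # singularity of Remark 2
        raise ZeroDivisionError("det[H_k + alpha I] is near zero; step undefined")
    bLM_p = (alpha + beta_D) / det             # beta_{LMji,k}
    bLM_q = (alpha + beta_C) / det             # beta_{LMlj,k}
    gLM = beta_E / det                         # gamma_{LM,k}
    p_new = p - bLM_p * dE_dp + gLM * dE_dq
    q_new = q - bLM_q * dE_dq + gLM * dE_dp
    return p_new, q_new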
Fig. 2. Two-hidden-layer artificial neural network.
III. TWO-HIDDEN-LAYER LEVENBERG–MARQUARDT
AND NEWTON ALGORITHMS FOR THE ARTIFICIAL
NEURAL NETWORK LEARNING
In this section, the two-hidden-layer Levenberg–Marquardt
and Newton algorithms are presented as a comparison with the
Levenberg–Marquardt and Newton algorithms for the artificial
neural network learning.
A. Two-Hidden-Layer Hessian for the Artificial
Neural Network Learning
In this article, we use a two-hidden-layer artificial neural network. This artificial neural network uses hyperbolic tangent functions in the two hidden layers and linear functions in the output layer. We define the two-hidden-layer artificial neural network as
d_{l,k} = \sum_j q_{lj,k} \, g\Big( \sum_i p_{ji,k} \, g\Big( \sum_r u_{ir,k} v_{r,k} \Big) \Big)    (25)
where pji,k and uir,k are the weights of the two hidden layers,
qlj,k are the weights of the output layer, g(·) are the activation
functions, vr,k are the artificial neural network inputs, dl,k
are the artificial neural network outputs, r is the input layer,
j and i are the hidden layers, l is the output layer, and k is
the iteration.
We consider the two-hidden-layer artificial neural network shown in Fig. 2. We define p_{ji,k} and u_{ir,k} as the weights of the two hidden layers and q_{lj,k} as the weights of the output layer.
We define the cost function Ek as
E_k = \frac{1}{2} \sum_{l=1}^{L_T} \big( d_{l,k} - t_{l,k} \big)^2    (26)
where dl,k is the artificial neural network output, tl,k is
the data set target, and LT is the total outputs number.
The second-order partial derivatives of the cost function Ek
with respect to the weights pji,k, uir,k , and qlj,k will be
used to obtain the two-hidden-layer Newton and Levenberg–
Marquardt algorithms.
We consider the forward propagation as
w_{i,k} = \sum_r u_{ir,k} v_{r,k}, \quad a_{i,k} = g\big(w_{i,k}\big)
z_{j,k} = \sum_i p_{ji,k} a_{i,k}, \quad c_{j,k} = g\big(z_{j,k}\big)
x_{l,k} = \sum_j q_{lj,k} c_{j,k}, \quad d_{l,k} = f\big(x_{l,k}\big) = x_{l,k}    (27)
where v_{r,k} are the artificial neural network inputs and d_{l,k} are the artificial neural network outputs, p_{ji,k} and u_{ir,k} are hidden layer weights, and q_{lj,k} are output layer weights.
We consider the activation functions in the two hidden
layers as the hyperbolic tangent functions
g\big(w_{i,k}\big) = \frac{e^{w_{i,k}} - e^{-w_{i,k}}}{e^{w_{i,k}} + e^{-w_{i,k}}} = \tanh\big(w_{i,k}\big)
g\big(z_{j,k}\big) = \frac{e^{z_{j,k}} - e^{-z_{j,k}}}{e^{z_{j,k}} + e^{-z_{j,k}}} = \tanh\big(z_{j,k}\big).    (28)
We consider the activation functions of the output layer as the
linear functions
f\big(x_{l,k}\big) = x_{l,k}.    (29)
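The forward propagation (27) extends the earlier one-hidden-layer sketch with one more hidden layer; names and shapes are again our own choices.

import numpy as np

def forward_two_hidden(v, U, P, Q):
    # v: inputs v_{r,k}, shape (R,); U: first hidden weights u_{ir,k}, shape (I, R);
    # P: second hidden weights p_{ji,k}, shape (J, I); Q: output weights, shape (L, J)
    w = U @ v            # w_{i,k} = sum_r u_{ir,k} v_{r,k}
    a = np.tanh(w)       # a_{i,k} = g(w_{i,k})
    z = P @ a            # z_{j,k} = sum_i p_{ji,k} a_{i,k}
    c = np.tanh(z)       # c_{j,k} = g(z_{j,k})
    d = Q @ c            # d_{l,k} = f(x_{l,k}) = x_{l,k}
    return d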
We define the second derivative of Ek as the two-hidden-
layer Hessian Hk [25]–[27]
H_k = \nabla\nabla E_k = \begin{bmatrix} \dfrac{\partial^2 E}{\partial p_{ji}^2} & \dfrac{\partial^2 E}{\partial p_{ji} \partial q_{lj}} & \dfrac{\partial^2 E}{\partial p_{ji} \partial u_{ir}} \\ \dfrac{\partial^2 E}{\partial p_{ji} \partial q_{lj}} & \dfrac{\partial^2 E}{\partial q_{lj}^2} & \dfrac{\partial^2 E}{\partial q_{lj} \partial u_{ir}} \\ \dfrac{\partial^2 E}{\partial p_{ji} \partial u_{ir}} & \dfrac{\partial^2 E}{\partial q_{lj} \partial u_{ir}} & \dfrac{\partial^2 E}{\partial u_{ir}^2} \end{bmatrix}.    (30)
In the next step, we evaluate the two-hidden-layer Hessian
with the two-hidden-layer Levenberg–Marquardt and Newton
algorithms.
B. Two-Hidden-Layer Newton Algorithm
The two-hidden-layer Newton algorithm constitutes one
alternative to update the weights for the two-hidden-layer
artificial neural network learning. We represent the updating
of the two-hidden-layer Newton algorithm as [1], [2]
\begin{bmatrix} p_{ji,k+1} \\ q_{lj,k+1} \\ u_{ir,k+1} \end{bmatrix} = \begin{bmatrix} p_{ji,k} \\ q_{lj,k} \\ u_{ir,k} \end{bmatrix} - \alpha [H_k]^{-1} \begin{bmatrix} \dfrac{\partial E_k}{\partial p_{ji,k}} \\ \dfrac{\partial E_k}{\partial q_{lj,k}} \\ \dfrac{\partial E_k}{\partial u_{ir,k}} \end{bmatrix}
H_k = \begin{bmatrix} \beta_{C,k} & \beta_{E,k} & \beta_{G,k} \\ \beta_{E,k} & \beta_{D,k} & \beta_{L,k} \\ \beta_{G,k} & \beta_{L,k} & \beta_{F,k} \end{bmatrix}, \quad \sum_{ir} = \sum_i \sum_r
\beta_{C,k} = \sum_{jil} \frac{\partial^2 E_k}{\partial p_{ji,k}^2}, \quad \beta_{D,k} = \sum_j \frac{\partial^2 E_k}{\partial q_{lj,k}^2}
\beta_{E,k} = \sum_{jil} \frac{\partial^2 E_k}{\partial p_{ji,k} \partial q_{lj,k}}, \quad \sum_{jil} = \sum_j \sum_i \sum_l
\beta_{F,k} = \sum_{ir} \frac{\partial^2 E}{\partial u_{ir}^2}, \quad \beta_{G,k} = \sum_{jir} \frac{\partial^2 E}{\partial p_{ji} \partial u_{ir}}
\beta_{L,k} = \sum_{jir} \frac{\partial^2 E}{\partial q_{lj} \partial u_{ir}}, \quad \sum_{jir} = \sum_j \sum_i \sum_r    (31)
where p_{ji,k}, u_{ir,k}, and q_{lj,k} are the weights and \alpha is the learning factor. The two-hidden-layer Newton algorithm requires the existence of the inverse of the Hessian, [H_k]^{-1}.
From (31), we obtain the inverse of H_k as

[H_k]^{-1} = \begin{bmatrix} \beta_{C,k} & \beta_{E,k} & \beta_{G,k} \\ \beta_{E,k} & \beta_{D,k} & \beta_{L,k} \\ \beta_{G,k} & \beta_{L,k} & \beta_{F,k} \end{bmatrix}^{-1} = \frac{1}{\det[H_k]} \begin{bmatrix} \beta_{D,k}\beta_{F,k} - \beta_{L,k}^2 & -\beta_{E,k}\beta_{F,k} + \beta_{L,k}\beta_{G,k} & \beta_{E,k}\beta_{L,k} - \beta_{D,k}\beta_{G,k} \\ -\beta_{E,k}\beta_{F,k} + \beta_{L,k}\beta_{G,k} & \beta_{C,k}\beta_{F,k} - \beta_{G,k}^2 & -\beta_{C,k}\beta_{L,k} + \beta_{G,k}\beta_{E,k} \\ \beta_{E,k}\beta_{L,k} - \beta_{D,k}\beta_{G,k} & -\beta_{C,k}\beta_{L,k} + \beta_{G,k}\beta_{E,k} & \beta_{C,k}\beta_{D,k} - \beta_{E,k}^2 \end{bmatrix}
\det[H_k]_N = \det[H_k] = \beta_{C,k}\big(\beta_{D,k}\beta_{F,k} - \beta_{L,k}^2\big) - \beta_{E,k}\big(\beta_{E,k}\beta_{F,k} - \beta_{L,k}\beta_{G,k}\big) + \beta_{G,k}\big(\beta_{E,k}\beta_{L,k} - \beta_{D,k}\beta_{G,k}\big).    (32)
Remark 3: In the two-hidden-layer Newton algorithm of (32) and (31), we can observe that values of zero in (\beta_{D,k})(\beta_{F,k}) - (\beta_{L,k})^2, (\beta_{E,k})(\beta_{F,k}) - (\beta_{L,k})(\beta_{G,k}), and (\beta_{E,k})(\beta_{L,k}) - (\beta_{D,k})(\beta_{G,k}) of \det[H_k]_N are three singularity points of the learning rates \beta_{Nji,k}, \beta_{Nlj,k}, and \gamma_{N,k}. The two-hidden-layer Newton algorithm of (32) and (31) is worse than the Newton algorithm of (18) and (19) because the latter presents one singularity point, while the former presents three singularity points.
C. Two-Hidden-Layer Levenberg–Marquardt Algorithm
The two-hidden-layer Levenberg–Marquardt algorithm
constitutes one alternative to update the weights for the
two-hidden-layer artificial neural network learning. We rep-
resent the basic updating of the two-hidden-layer Levenberg–
Marquardt algorithm as [8]–[11]
\begin{bmatrix} p_{ji,k+1} \\ q_{lj,k+1} \\ u_{ir,k+1} \end{bmatrix} = \begin{bmatrix} p_{ji,k} \\ q_{lj,k} \\ u_{ir,k} \end{bmatrix} - [H_k + \alpha I]^{-1} \begin{bmatrix} \dfrac{\partial E_k}{\partial p_{ji,k}} \\ \dfrac{\partial E_k}{\partial q_{lj,k}} \\ \dfrac{\partial E_k}{\partial u_{ir,k}} \end{bmatrix}
H_k = \begin{bmatrix} \beta_{C,k} & \beta_{E,k} & \beta_{G,k} \\ \beta_{E,k} & \beta_{D,k} & \beta_{L,k} \\ \beta_{G,k} & \beta_{L,k} & \beta_{F,k} \end{bmatrix}, \quad \sum_{ir} = \sum_i \sum_r
\beta_{C,k} = \sum_{jil} \frac{\partial^2 E_k}{\partial p_{ji,k}^2}, \quad \beta_{D,k} = \sum_j \frac{\partial^2 E_k}{\partial q_{lj,k}^2}
\beta_{E,k} = \sum_{jil} \frac{\partial^2 E_k}{\partial p_{ji,k} \partial q_{lj,k}}, \quad \sum_{jil} = \sum_j \sum_i \sum_l
\beta_{F,k} = \sum_{ir} \frac{\partial^2 E}{\partial u_{ir}^2}, \quad \beta_{G,k} = \sum_{jir} \frac{\partial^2 E}{\partial p_{ji} \partial u_{ir}}
\beta_{L,k} = \sum_{jir} \frac{\partial^2 E}{\partial q_{lj} \partial u_{ir}}, \quad \sum_{jir} = \sum_j \sum_i \sum_r    (33)
where p_{ji,k}, u_{ir,k}, and q_{lj,k} are the weights and \alpha is the learning factor. The two-hidden-layer Levenberg–Marquardt algorithm requires the existence of the inverse [H_k + \alpha I]^{-1}.
From (33), we obtain the inverse of H_k + \alpha I as

[H_k + \alpha I]^{-1} = \begin{bmatrix} \alpha + \beta_{C,k} & \beta_{E,k} & \beta_{G,k} \\ \beta_{E,k} & \alpha + \beta_{D,k} & \beta_{L,k} \\ \beta_{G,k} & \beta_{L,k} & \alpha + \beta_{F,k} \end{bmatrix}^{-1} = \frac{1}{\det[H_k + \alpha I]} \begin{bmatrix} (\alpha + \beta_{D,k})(\alpha + \beta_{F,k}) - \beta_{L,k}^2 & -\beta_{E,k}(\alpha + \beta_{F,k}) + \beta_{L,k}\beta_{G,k} & \beta_{E,k}\beta_{L,k} - (\alpha + \beta_{D,k})\beta_{G,k} \\ -\beta_{E,k}(\alpha + \beta_{F,k}) + \beta_{L,k}\beta_{G,k} & (\alpha + \beta_{C,k})(\alpha + \beta_{F,k}) - \beta_{G,k}^2 & -(\alpha + \beta_{C,k})\beta_{L,k} + \beta_{G,k}\beta_{E,k} \\ \beta_{E,k}\beta_{L,k} - (\alpha + \beta_{D,k})\beta_{G,k} & -(\alpha + \beta_{C,k})\beta_{L,k} + \beta_{G,k}\beta_{E,k} & (\alpha + \beta_{C,k})(\alpha + \beta_{D,k}) - \beta_{E,k}^2 \end{bmatrix}
\det[H_k]_{LM} = \det[H_k + \alpha I] = \big(\alpha + \beta_{C,k}\big)\big[\big(\alpha + \beta_{D,k}\big)\big(\alpha + \beta_{F,k}\big) - \beta_{L,k}^2\big] - \beta_{E,k}\big[\beta_{E,k}\big(\alpha + \beta_{F,k}\big) - \beta_{L,k}\beta_{G,k}\big] + \beta_{G,k}\big[\beta_{E,k}\beta_{L,k} - \big(\alpha + \beta_{D,k}\big)\beta_{G,k}\big].    (34)
Remark 4: In the two-hidden-layer Levenberg–Marquardt algorithm of (34) and (33), we can observe that values of zero in (\alpha + \beta_{D,k})(\alpha + \beta_{F,k}) - (\beta_{L,k})^2, (\beta_{E,k})(\alpha + \beta_{F,k}) - (\beta_{L,k})(\beta_{G,k}), and (\beta_{E,k})(\beta_{L,k}) - (\alpha + \beta_{D,k})(\beta_{G,k}) of \det[H_k]_{LM} are three singularity points of the learning rates \beta_{LMji,k}, \beta_{LMlj,k}, and \gamma_{LM,k}. The two-hidden-layer Levenberg–Marquardt algorithm of (34) and (33) is worse than the Levenberg–Marquardt algorithm of (23) and (24) because the latter presents one singularity point, while the former presents three singularity points.
IV. ERROR STABILITY AND WEIGHTS BOUNDEDNESS
ANALYSIS OF THE MODIFIED LEVENBERG–
MARQUARDT ALGORITHM
In this section, the modified Levenberg–Marquardt algo-
rithm is introduced for the artificial neural network learning,
and the error stability and weights boundedness are analyzed.
A. Modified Levenberg–Marquardt Algorithm
The modified Levenberg–Marquardt algorithm is defined as
p_{ji,k+1} = p_{ji,k} - \beta_{MLM,k} \frac{\partial E_k}{\partial p_{ji,k}} + \gamma_{MH,k} \frac{\partial E_k}{\partial q_{lj,k}}
q_{lj,k+1} = q_{lj,k} - \beta_{MLM,k} \frac{\partial E_k}{\partial q_{lj,k}} + \gamma_{MH,k} \frac{\partial E_k}{\partial p_{ji,k}}
\beta_{MLM,k} = \frac{\big(\alpha + (\beta_{C,k})^2\big)\big(\alpha + (\beta_{D,k})^2\big)}{\det[H_k]_{MLM}}
\det[H_k]_{MLM} = \big[\alpha + (\beta_{A,k})^2 + (\beta_{B,k})^2\big] \big[\big(\alpha + (\beta_{C,k})^2\big)\big(\alpha + (\beta_{D,k})^2\big) + (\beta_{E,k})^2\big]
\beta_{A,k} = \sum_{ji} \frac{\partial E_k/\partial p_{ji,k}}{d_{l,k} - t_{l,k}}, \quad \beta_{B,k} = \sum_j \frac{\partial E_k/\partial q_{lj,k}}{d_{l,k} - t_{l,k}}
\beta_{C,k} = \sum_{jil} \frac{\partial^2 E_k}{\partial p_{ji,k}^2}, \quad \beta_{D,k} = \sum_j \frac{\partial^2 E_k}{\partial q_{lj,k}^2}
\beta_{E,k} = \sum_{jil} \frac{\partial^2 E_k}{\partial p_{ji,k} \partial q_{lj,k}}, \quad \gamma_{MH,k} = 0
\sum_{jil} = \sum_j \sum_i \sum_l, \quad \sum_{ji} = \sum_j \sum_i    (35)
where
\frac{\partial E_k/\partial p_{ji,k}}{d_{l,k} - t_{l,k}} = q_{lj,k} g'\big(z_{j,k}\big) a_{i,k}
\frac{\partial E_k/\partial q_{lj,k}}{d_{l,k} - t_{l,k}} = g\big(z_{j,k}\big)
\frac{\partial E_k}{\partial p_{ji,k}} = \big( d_{l,k} - t_{l,k} \big) q_{lj,k} g'\big(z_{j,k}\big) a_{i,k}
\frac{\partial E_k}{\partial q_{lj,k}} = g\big(z_{j,k}\big)\big( d_{l,k} - t_{l,k} \big)
\frac{\partial^2 E_k}{\partial p_{ji,k}^2} = a_{i,k}^2 q_{lj,k} \Big[ -2 g\big(z_{j,k}\big) g'\big(z_{j,k}\big)\big( d_{l,k} - t_{l,k} \big) + \big(g'\big(z_{j,k}\big)\big)^2 q_{lj,k} \Big]
\frac{\partial^2 E_k}{\partial q_{lj,k}^2} = \big(g\big(z_{j,k}\big)\big)^2
\frac{\partial^2 E_k}{\partial p_{ji,k} \partial q_{lj,k}} = a_{i,k} g'\big(z_{j,k}\big)\big[\big( d_{l,k} - t_{l,k} \big) + g\big(z_{j,k}\big) q_{lj,k}\big].    (36)
\beta_{MLM,k} is the learning rate, p_{ji,k} and q_{lj,k} are the weights, \alpha is the learning factor, g(z_{j,k}) = \tanh(z_{j,k}) are the activation functions, and g'(z_{j,k}) = \mathrm{sech}^2(z_{j,k}) are the derivatives of the activation functions. Equations (35) and (36) describe the modified Levenberg–Marquardt algorithm.
Remark 5: The modified Levenberg–Marquardt algorithm of (35) and (36) is based on the Levenberg–Marquardt algorithm of (23) and (24) and on the Newton algorithm of (18) and (19), but with the following two differences that assure the error stability and weights boundedness.
1) A value of zero in (\beta_{C,k})(\beta_{D,k}) - (\beta_{E,k})^2 of \det[H_k]_N is a singularity point of the learning rates \beta_{Nji,k}, \beta_{Nlj,k}, and \gamma_{N,k} of the Newton algorithm, and a value of zero in (\alpha + \beta_{C,k})(\alpha + \beta_{D,k}) - (\beta_{E,k})^2 of \det[H_k]_{LM} is a singularity point of the learning rates \beta_{LMji,k}, \beta_{LMlj,k}, and \gamma_{LM,k} of the Levenberg–Marquardt algorithm. In contrast, [\alpha + (\beta_{A,k})^2 + (\beta_{B,k})^2][(\alpha + (\beta_{C,k})^2)(\alpha + (\beta_{D,k})^2) + (\beta_{E,k})^2] of \det[H_k]_{MLM} is never zero, so there is no singularity point in the learning rate \beta_{MLM,k} of the modified Levenberg–Marquardt algorithm.
2) The Levenberg–Marquardt algorithm has three different learning rates \beta_{LMji,k}, \beta_{LMlj,k}, and \gamma_{LM,k}, and the Newton algorithm has three different learning rates \beta_{Nji,k}, \beta_{Nlj,k}, and \gamma_{N,k}, while the modified Levenberg–Marquardt algorithm has only one learning rate \beta_{MLM,k}.
These differences are what allow the error stability and weights boundedness of the modified Levenberg–Marquardt algorithm to be assured in Section IV-B.
Remark 6: The application of the modified Levenberg–Marquardt algorithm for the artificial neural network learning is based on the following steps: 1) obtain the artificial neural network output d_{l,k} of Fig. 1 with (1) and (3); 2) obtain the backpropagation of the output layer (\partial E_k/\partial q_{lj,k}) with (9) and the backpropagation of the hidden layer (\partial E_k/\partial p_{ji,k}) with (10); and 3) obtain the updating of the hidden layer weights p_{ji,k} and the output layer weights q_{lj,k} with (35) and (36). Please note that step 3) represents the artificial neural network learning.
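The three steps of Remark 6 can be collected into a single training-step sketch for the single-output case; the shapes and helper names are ours, and the scalars follow (35) and (36). The denominator of \beta_{MLM,k} is strictly positive (at least \alpha^3), so no singularity guard is needed.

import numpy as np

def mlm_step(a, P, Q, t, alpha=0.9):
    # step 1): artificial neural network output, (1) and (3)
    z = P @ a
    g = np.tanh(z)                      # g(z_{j,k})
    g1 = 1.0 - g ** 2                   # g'(z_{j,k}) = sech^2(z_{j,k})
    q = Q[0]                            # single output, l = 1
    e = float(q @ g - t)                # error d_{1,k} - t_{1,k}
    # step 2): backpropagation, (9) and (10)
    dE_dQ = e * g                       # shape (J,)
    dE_dP = np.outer(e * q * g1, a)     # shape (J, I)
    # scalars of (35), using the error-normalized ratios of (36)
    beta_A = (q * g1).sum() * a.sum()   # sum_{ji} (dE/dp)/(d - t)
    beta_B = g.sum()                    # sum_j (dE/dq)/(d - t)
    beta_C = (a ** 2).sum() * (q * (-2.0 * g * g1 * e) + (g1 ** 2) * q ** 2).sum()
    beta_D = (g ** 2).sum()
    beta_E = a.sum() * (g1 * (e + g * q)).sum()
    num = (alpha + beta_C ** 2) * (alpha + beta_D ** 2)
    den = (alpha + beta_A ** 2 + beta_B ** 2) * (num + beta_E ** 2)
    beta_MLM = num / den                # (35); den >= alpha**3 > 0, no singularity
    # step 3): weight updating of (35), with gamma_{MH,k} = 0
    P_new = P - beta_MLM * dE_dP
    Q_new = Q - beta_MLM * dE_dQ[None, :]
    return P_new, Q_new, e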
B. Error Stability and Weights Boundedness Analysis
We analyze the error stability of the modified Levenberg–Marquardt algorithm by the Lyapunov method, as detailed in the following theorem.
Theorem 1: The errors of the modified Levenberg–Marquardt algorithm (1), (3), (35), and (36), applied for the learning of the data set targets t_{l,k}, are uniformly stable, and the upper bound of the average errors o_{l,k}^2 satisfies

\limsup_{T \to \infty} \frac{1}{T} \sum_{k=2}^{T} o_{l,k}^2 \le \frac{2}{\alpha} \mu_l^2    (37)

where o_{l,k}^2 = (1/2)\beta_{MLM,k-1}\big(d_{l,k-1} - t_{l,k-1}\big)^2, 0 < \alpha \le 1 \in \mathbb{R} and 0 < \beta_{MLM,k} \in \mathbb{R} are in (35), (d_{l,k-1} - t_{l,k-1}) are the errors, and \mu_l are the upper bounds of the uncertainties \mu_{l,k}, |\mu_{l,k}| \le \mu_l.
Proof: Define the positive function

V_{l,k} = \frac{1}{2}\beta_{MLM,k-1}\big(d_{l,k-1} - t_{l,k-1}\big)^2 + \sum_{ji} \tilde{p}_{ji,k}^2 + \sum_j \tilde{q}_{lj,k}^2    (38)

where \tilde{p}_{ji,k} and \tilde{q}_{lj,k} are the weight errors of the weights p_{ji,k} and q_{lj,k} in (35) and (36). Then, the increment \Delta V_{l,k} is

\Delta V_{l,k} = \frac{1}{2}\beta_{MLM,k}\big(d_{l,k} - t_{l,k}\big)^2 + \sum_{ji} \tilde{p}_{ji,k+1}^2 + \sum_j \tilde{q}_{lj,k+1}^2 - \frac{1}{2}\beta_{MLM,k-1}\big(d_{l,k-1} - t_{l,k-1}\big)^2 - \sum_{ji} \tilde{p}_{ji,k}^2 - \sum_j \tilde{q}_{lj,k}^2.    (39)
Now, the weight errors satisfy

\sum_{ji} \tilde{p}_{ji,k+1}^2 = \sum_{ji} \tilde{p}_{ji,k}^2 - 2\beta_{MLM,k} \frac{\partial E_k}{\partial p_{ji,k}} \sum_{ji} \tilde{p}_{ji,k} + \beta_{MLM,k}^2 \Big(\frac{\partial E_k}{\partial p_{ji,k}}\Big)^2
\sum_j \tilde{q}_{lj,k+1}^2 = \sum_j \tilde{q}_{lj,k}^2 - 2\beta_{MLM,k} \frac{\partial E_k}{\partial q_{lj,k}} \sum_j \tilde{q}_{lj,k} + \beta_{MLM,k}^2 \Big(\frac{\partial E_k}{\partial q_{lj,k}}\Big)^2.    (40)
Substituting (40) into (39) gives

\Delta V_{l,k} = -2\beta_{MLM,k} \frac{\partial E_k}{\partial p_{ji,k}} \sum_{ji} \tilde{p}_{ji,k} + \beta_{MLM,k}^2 \Big(\frac{\partial E_k}{\partial p_{ji,k}}\Big)^2 - 2\beta_{MLM,k} \frac{\partial E_k}{\partial q_{lj,k}} \sum_j \tilde{q}_{lj,k} + \beta_{MLM,k}^2 \Big(\frac{\partial E_k}{\partial q_{lj,k}}\Big)^2 + \frac{1}{2}\beta_{MLM,k}\big(d_{l,k} - t_{l,k}\big)^2 - \frac{1}{2}\beta_{MLM,k-1}\big(d_{l,k-1} - t_{l,k-1}\big)^2.    (41)
Equation (41) is rewritten as

\Delta V_{l,k} = \frac{1}{2}\beta_{MLM,k}\big(d_{l,k} - t_{l,k}\big)^2 - \frac{1}{2}\beta_{MLM,k-1}\big(d_{l,k-1} - t_{l,k-1}\big)^2 - 2\beta_{MLM,k}\Big[\frac{\partial E_k}{\partial p_{ji,k}} \sum_{ji} \tilde{p}_{ji,k} + \frac{\partial E_k}{\partial q_{lj,k}} \sum_j \tilde{q}_{lj,k}\Big] + \beta_{MLM,k}^2 \Big[\Big(\frac{\partial E_k}{\partial p_{ji,k}}\Big)^2 + \Big(\frac{\partial E_k}{\partial q_{lj,k}}\Big)^2\Big].    (42)
Using the closed-loop dynamics \big[(\partial E_k/\partial p_{ji,k})/(d_{l,k} - t_{l,k})\big] \sum_{ji} \tilde{p}_{ji,k} + \big[(\partial E_k/\partial q_{lj,k})/(d_{l,k} - t_{l,k})\big] \sum_j \tilde{q}_{lj,k} = (d_{l,k} - t_{l,k}) - \mu_{l,k} of [31] and [33] in the second element of (42), it can be seen that

\frac{\partial E_k}{\partial p_{ji,k}} \sum_{ji} \tilde{p}_{ji,k} + \frac{\partial E_k}{\partial q_{lj,k}} \sum_j \tilde{q}_{lj,k} = \big(d_{l,k} - t_{l,k}\big)\Big[\frac{\partial E_k/\partial p_{ji,k}}{d_{l,k} - t_{l,k}} \sum_{ji} \tilde{p}_{ji,k} + \frac{\partial E_k/\partial q_{lj,k}}{d_{l,k} - t_{l,k}} \sum_j \tilde{q}_{lj,k}\Big] = \big(d_{l,k} - t_{l,k}\big)\big[\big(d_{l,k} - t_{l,k}\big) - \mu_{l,k}\big]    (43)
where \mu_{l,k} are the uncertainties. Substituting (43) into the second element of (42) gives

\Delta V_{l,k} = \frac{1}{2}\beta_{MLM,k}\big(d_{l,k} - t_{l,k}\big)^2 - \frac{1}{2}\beta_{MLM,k-1}\big(d_{l,k-1} - t_{l,k-1}\big)^2 - 2\beta_{MLM,k}\big(d_{l,k} - t_{l,k}\big)\big[\big(d_{l,k} - t_{l,k}\big) - \mu_{l,k}\big] + \beta_{MLM,k}^2 \Big[\Big(\sum_{ji}\frac{\partial E_k}{\partial p_{ji,k}}\Big)^2 + \Big(\sum_j \frac{\partial E_k}{\partial q_{lj,k}}\Big)^2\Big]
\Delta V_{l,k} = \frac{1}{2}\beta_{MLM,k}\big(d_{l,k} - t_{l,k}\big)^2 - \frac{1}{2}\beta_{MLM,k-1}\big(d_{l,k-1} - t_{l,k-1}\big)^2 - 2\beta_{MLM,k}\big(d_{l,k} - t_{l,k}\big)^2 + 2\beta_{MLM,k}\big(d_{l,k} - t_{l,k}\big)\mu_{l,k} + \beta_{MLM,k}^2\big(d_{l,k} - t_{l,k}\big)^2\big[\big(\beta_{A,k}\big)^2 + \big(\beta_{B,k}\big)^2\big]    (44)

where \beta_{A,k} = \sum_{ji} \frac{\partial E_k/\partial p_{ji,k}}{d_{l,k} - t_{l,k}} and \beta_{B,k} = \sum_j \frac{\partial E_k/\partial q_{lj,k}}{d_{l,k} - t_{l,k}}. Substituting \beta_{MLM,k} of (35) into the element \beta_{MLM,k}^2\big(d_{l,k} - t_{l,k}\big)^2\big[\big(\beta_{A,k}\big)^2 + \big(\beta_{B,k}\big)^2\big] and considering \alpha \le 1 gives (45), shown after (55) below.
Taking into account that 2\beta_{MLM,k}\big(d_{l,k} - t_{l,k}\big)\mu_{l,k} \le \frac{1}{2}\beta_{MLM,k}\big(d_{l,k} - t_{l,k}\big)^2 + 2\beta_{MLM,k}\mu_{l,k}^2 and employing (45) in (44) gives

\Delta V_{l,k} \le \frac{1}{2}\beta_{MLM,k}\big(d_{l,k} - t_{l,k}\big)^2 - \frac{1}{2}\beta_{MLM,k-1}\big(d_{l,k-1} - t_{l,k-1}\big)^2 - 2\beta_{MLM,k}\big(d_{l,k} - t_{l,k}\big)^2 + \frac{1}{2}\beta_{MLM,k}\big(d_{l,k} - t_{l,k}\big)^2 + 2\beta_{MLM,k}\mu_{l,k}^2 + \beta_{MLM,k}\big(d_{l,k} - t_{l,k}\big)^2
\Delta V_{l,k} \le -\frac{1}{2}\beta_{MLM,k-1}\big(d_{l,k-1} - t_{l,k-1}\big)^2 + 2\beta_{MLM,k}\mu_{l,k}^2.    (46)
From (35)

\beta_{MLM,k} = \frac{\big(\alpha + (\beta_{C,k})^2\big)\big(\alpha + (\beta_{D,k})^2\big)}{\big[\alpha + (\beta_{A,k})^2 + (\beta_{B,k})^2\big]\big[\big(\alpha + (\beta_{C,k})^2\big)\big(\alpha + (\beta_{D,k})^2\big) + (\beta_{E,k})^2\big]} \le \frac{1}{\alpha}.    (47)
Employing (47) and |\mu_{l,k}| \le \mu_l in (46) gives

\Delta V_{l,k} \le -\frac{1}{2}\beta_{MLM,k-1}\big(d_{l,k-1} - t_{l,k-1}\big)^2 + \frac{2}{\alpha}\mu_l^2.    (48)
Employing (48), the errors of the modified Levenberg–Marquardt algorithm are uniformly stable. Hence, V_{l,k} is bounded. Taking into account (48) and o_{l,k}^2 of (37), it follows that

\Delta V_{l,k} \le -o_{l,k}^2 + \frac{2}{\alpha}\mu_l^2.    (49)
Summing (49) from k = 2 to T gives

\sum_{k=2}^{T} \Big( o_{l,k}^2 - \frac{2}{\alpha}\mu_l^2 \Big) \le V_{l,1} - V_{l,T}.    (50)
Employing that V_{l,T} > 0 is bounded gives

\frac{1}{T} \sum_{k=2}^{T} o_{l,k}^2 \le \frac{2}{\alpha}\mu_l^2 + \frac{1}{T} V_{l,1} \;\Rightarrow\; \limsup_{T \to \infty} \frac{1}{T} \sum_{k=2}^{T} o_{l,k}^2 \le \frac{2}{\alpha}\mu_l^2.    (51)

Equation (51) is the bound of (37), which completes the proof.
Remark 7: Theorem 1 assures that the errors of the modified Levenberg–Marquardt algorithm for the artificial neural network learning are stable; consequently, the artificial neural network outputs d_{l,k} of the modified Levenberg–Marquardt algorithm remain bounded during all the training and testing.
The following theorem proves the weights boundedness of
the modified Levenberg–Marquardt.
Theorem 2: When the average errors o_{l,k+1}^2 are bigger than the uncertainties (2/\alpha)\mu_l^2, the weight errors are bounded by the initial weight errors as

o_{l,k+1}^2 \ge \frac{2}{\alpha}\mu_l^2 \;\Rightarrow\; \sum_{ji} \tilde{p}_{ji,k+1}^2 + \sum_j \tilde{q}_{lj,k+1}^2 \le \sum_{ji} \tilde{p}_{ji,1}^2 + \sum_j \tilde{q}_{lj,1}^2    (52)

where \tilde{p}_{ji,k+1} and \tilde{q}_{lj,k+1} are the weight errors, \tilde{p}_{ji,1} and \tilde{q}_{lj,1} are the initial weight errors, o_{l,k+1}^2 = (1/2)\beta_{MLM,k}\big(d_{l,k} - t_{l,k}\big)^2 and (d_{l,k-1} - t_{l,k-1}) are the errors, 0 < \alpha \le 1 \in \mathbb{R}, 0 < \beta_{MLM,k} \in \mathbb{R}, and \mu_l are the upper bounds of the uncertainties \mu_{l,k}, |\mu_{l,k}| \le \mu_l.
Proof: From (40), the weight errors are written as

\sum_{ji} \tilde{p}_{ji,k+1}^2 = \sum_{ji} \tilde{p}_{ji,k}^2 - 2\beta_{MLM,k} \frac{\partial E_k}{\partial p_{ji,k}} \sum_{ji} \tilde{p}_{ji,k} + \beta_{MLM,k}^2 \Big(\frac{\partial E_k}{\partial p_{ji,k}}\Big)^2
\sum_j \tilde{q}_{lj,k+1}^2 = \sum_j \tilde{q}_{lj,k}^2 - 2\beta_{MLM,k} \frac{\partial E_k}{\partial q_{lj,k}} \sum_j \tilde{q}_{lj,k} + \beta_{MLM,k}^2 \Big(\frac{\partial E_k}{\partial q_{lj,k}}\Big)^2.    (53)
Adding \sum_{ji} \tilde{p}_{ji,k+1}^2 and \sum_j \tilde{q}_{lj,k+1}^2 of (53) gives

\sum_{ji} \tilde{p}_{ji,k+1}^2 + \sum_j \tilde{q}_{lj,k+1}^2 = \sum_{ji} \tilde{p}_{ji,k}^2 + \sum_j \tilde{q}_{lj,k}^2 - 2\beta_{MLM,k} \frac{\partial E_k}{\partial p_{ji,k}} \sum_{ji} \tilde{p}_{ji,k} + \beta_{MLM,k}^2 \Big(\frac{\partial E_k}{\partial p_{ji,k}}\Big)^2 - 2\beta_{MLM,k} \frac{\partial E_k}{\partial q_{lj,k}} \sum_j \tilde{q}_{lj,k} + \beta_{MLM,k}^2 \Big(\frac{\partial E_k}{\partial q_{lj,k}}\Big)^2.    (54)
Equation (54) is represented as

\sum_{ji} \tilde{p}_{ji,k+1}^2 + \sum_j \tilde{q}_{lj,k+1}^2 = \sum_{ji} \tilde{p}_{ji,k}^2 + \sum_j \tilde{q}_{lj,k}^2 - 2\beta_{MLM,k}\Big[\frac{\partial E_k}{\partial p_{ji,k}} \sum_{ji} \tilde{p}_{ji,k} + \frac{\partial E_k}{\partial q_{lj,k}} \sum_j \tilde{q}_{lj,k}\Big] + \beta_{MLM,k}^2 \Big[\Big(\frac{\partial E_k}{\partial p_{ji,k}}\Big)^2 + \Big(\frac{\partial E_k}{\partial q_{lj,k}}\Big)^2\Big].    (55)
The bound employed in (44) and (57) is

\beta_{MLM,k}^2\big(d_{l,k} - t_{l,k}\big)^2\big[(\beta_{A,k})^2 + (\beta_{B,k})^2\big] = \beta_{MLM,k}\big[(\beta_{A,k})^2 + (\beta_{B,k})^2\big]\beta_{MLM,k}\big(d_{l,k} - t_{l,k}\big)^2 = \frac{\big[(\beta_{A,k})^2 + (\beta_{B,k})^2\big]\big(\alpha + (\beta_{C,k})^2\big)\big(\alpha + (\beta_{D,k})^2\big)}{\big[\alpha + (\beta_{A,k})^2 + (\beta_{B,k})^2\big]\big[\big(\alpha + (\beta_{C,k})^2\big)\big(\alpha + (\beta_{D,k})^2\big) + (\beta_{E,k})^2\big]} \, \beta_{MLM,k}\big(d_{l,k} - t_{l,k}\big)^2 \le \beta_{MLM,k}\big(d_{l,k} - t_{l,k}\big)^2.    (45)
Substituting \frac{\partial E_k}{\partial p_{ji,k}} \sum_{ji} \tilde{p}_{ji,k} + \frac{\partial E_k}{\partial q_{lj,k}} \sum_j \tilde{q}_{lj,k} = \big(d_{l,k} - t_{l,k}\big)\big[\big(d_{l,k} - t_{l,k}\big) - \mu_{l,k}\big] of (43) into the second element of (55) gives

\sum_{ji} \tilde{p}_{ji,k+1}^2 + \sum_j \tilde{q}_{lj,k+1}^2 = \sum_{ji} \tilde{p}_{ji,k}^2 + \sum_j \tilde{q}_{lj,k}^2 - 2\beta_{MLM,k}\big(d_{l,k} - t_{l,k}\big)\big[\big(d_{l,k} - t_{l,k}\big) - \mu_{l,k}\big] + \beta_{MLM,k}^2\Big[\Big(\frac{\partial E_k}{\partial p_{ji,k}}\Big)^2 + \Big(\frac{\partial E_k}{\partial q_{lj,k}}\Big)^2\Big]
\sum_{ji} \tilde{p}_{ji,k+1}^2 + \sum_j \tilde{q}_{lj,k+1}^2 = \sum_{ji} \tilde{p}_{ji,k}^2 + \sum_j \tilde{q}_{lj,k}^2 - 2\beta_{MLM,k}\big(d_{l,k} - t_{l,k}\big)^2 + 2\beta_{MLM,k}\big(d_{l,k} - t_{l,k}\big)\mu_{l,k} + \beta_{MLM,k}^2\big(d_{l,k} - t_{l,k}\big)^2\big[(\beta_{A,k})^2 + (\beta_{B,k})^2\big]    (56)

where \mu_{l,k} are the uncertainties, \beta_{A,k} = \sum_{ji} \frac{\partial E_k/\partial p_{ji,k}}{d_{l,k} - t_{l,k}}, and \beta_{B,k} = \sum_j \frac{\partial E_k/\partial q_{lj,k}}{d_{l,k} - t_{l,k}}.
Substituting 2\beta_{MLM,k}\big(d_{l,k} - t_{l,k}\big)\mu_{l,k} \le \frac{1}{2}\beta_{MLM,k}\big(d_{l,k} - t_{l,k}\big)^2 + 2\beta_{MLM,k}\mu_{l,k}^2 into the third element of (56) and \beta_{MLM,k}^2\big(d_{l,k} - t_{l,k}\big)^2\big[(\beta_{A,k})^2 + (\beta_{B,k})^2\big] \le \beta_{MLM,k}\big(d_{l,k} - t_{l,k}\big)^2 of (45) into the last element of (56) gives

\sum_{ji} \tilde{p}_{ji,k+1}^2 + \sum_j \tilde{q}_{lj,k+1}^2 \le \sum_{ji} \tilde{p}_{ji,k}^2 + \sum_j \tilde{q}_{lj,k}^2 - 2\beta_{MLM,k}\big(d_{l,k} - t_{l,k}\big)^2 + \frac{1}{2}\beta_{MLM,k}\big(d_{l,k} - t_{l,k}\big)^2 + 2\beta_{MLM,k}\mu_{l,k}^2 + \beta_{MLM,k}\big(d_{l,k} - t_{l,k}\big)^2
\sum_{ji} \tilde{p}_{ji,k+1}^2 + \sum_j \tilde{q}_{lj,k+1}^2 \le \sum_{ji} \tilde{p}_{ji,k}^2 + \sum_j \tilde{q}_{lj,k}^2 - \frac{1}{2}\beta_{MLM,k}\big(d_{l,k} - t_{l,k}\big)^2 + 2\beta_{MLM,k}\mu_{l,k}^2.    (57)
From (47), \beta_{MLM,k} \le (1/\alpha), and using |\mu_{l,k}| \le \mu_l in (57) gives

\sum_{ji} \tilde{p}_{ji,k+1}^2 + \sum_j \tilde{q}_{lj,k+1}^2 \le \sum_{ji} \tilde{p}_{ji,k}^2 + \sum_j \tilde{q}_{lj,k}^2 - \frac{1}{2}\beta_{MLM,k}\big(d_{l,k} - t_{l,k}\big)^2 + \frac{2}{\alpha}\mu_l^2.    (58)
Taking into account o_{l,k+1}^2 = (1/2)\beta_{MLM,k}\big(d_{l,k} - t_{l,k}\big)^2 gives

o_{l,k+1}^2 \ge \frac{2}{\alpha}\mu_l^2 \;\Rightarrow\; \sum_{ji} \tilde{p}_{ji,k+1}^2 + \sum_j \tilde{q}_{lj,k+1}^2 \le \sum_{ji} \tilde{p}_{ji,k}^2 + \sum_j \tilde{q}_{lj,k}^2.    (59)
Taking into account that o_{l,k+1}^2 \ge (2/\alpha)\mu_l^2 holds for every iteration up to k, it follows that

\sum_{ji} \tilde{p}_{ji,k+1}^2 + \sum_j \tilde{q}_{lj,k+1}^2 \le \sum_{ji} \tilde{p}_{ji,k}^2 + \sum_j \tilde{q}_{lj,k}^2 \le \cdots \le \sum_{ji} \tilde{p}_{ji,1}^2 + \sum_j \tilde{q}_{lj,1}^2.    (60)

Then, (52) is proven.
Remark 8: Theorem 2 assures that the weights of the modified Levenberg–Marquardt algorithm are bounded; consequently, the hidden layer weights p_{ji,k} and output layer weights q_{lj,k} of the modified Levenberg–Marquardt algorithm for the artificial neural network learning remain bounded during all the training and testing.
V. RESULTS
In this section, we compare the Newton algorithm (N) of (1), (3), (18), and (19) [1], [2], the Levenberg–Marquardt algorithm (LM) of (1), (3), (23), and (24) [8]–[11], and the modified Levenberg–Marquardt algorithm (MLM) of (1), (3), (35), and (36) for the artificial neural network learning of the electric signal data set, because these three algorithms are based on the Hessian. We also compare the stable gradient algorithm in a neural network (SGNN) of [31] and [32], the stable gradient algorithm in a radial basis function neural network (SGRBFNN) of [33] and [34], and the MLM of (1), (3), (35), and (36) for the artificial neural network learning of the brain signal data set, because these three algorithms are based on stability. The objective of N, LM, SGNN, SGRBFNN, and MLM is that the artificial neural network outputs d_{l,k} must follow the data set targets t_{l,k} as closely as possible.
In this part of the article, the abovementioned algorithms are applied for the artificial neural network learning, containing the training and testing stages. The root-mean-square error (RMSE) is utilized to show the performance accuracy for the comparisons, and it is represented as

E = \Bigg( \frac{1}{T} \sum_{k=1}^{T} \sum_{l=1}^{L_T} \big( d_{l,k} - t_{l,k} \big)^2 \Bigg)^{\frac{1}{2}}    (61)
where d_{l,k} - t_{l,k} are the errors, d_{l,k} are the artificial neural network outputs, t_{l,k} are the data set targets, L_T is the total outputs number, and T is the final iteration.
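A direct implementation of (61) could read as follows; the array names are our own assumptions.

import numpy as np

def rmse(D, Tgt):
    # D, Tgt: shape (T, L_T) arrays of outputs d_{l,k} and targets t_{l,k}
    return float(np.sqrt(np.mean(np.sum((D - Tgt) ** 2, axis=1))))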
A. Electric Signals
The electric signal data set information is obtained from Electricity Load and Price Forecasting with MATLAB, where the details are explained in [35]. The electric signal data set is the history of electric energy usage at each hour, together with temperature observations, from the independent system operator (ISO) of Great Britain. The meteorological information includes the dry bulb temperature and the dew point. The electric signal data set of the hourly electric energy usage is called an electric signal.
In the electric signal data set, we consider eight inputs described as follows: a_{1,k} is the dry bulb temperature, a_{2,k} is the dew point, a_{3,k} is the hour of the day, a_{4,k} is the day of the week, a_{5,k} is a flag indicating whether the day is a holiday or weekend, a_{6,k} is the average load of the previous day, a_{7,k} is the load of the same hour of the previous day, and a_{8,k} is the load of the same hour and day of the previous week. We consider one target described as follows: t_{1,k} is the load of the same day.
In the artificial neural network learning, we consider eight
artificial neural network inputs denoted as a1,k, a2,k, a3,k, a4,k,
a5,k, a6,k, a7,k, and a8,k that are the same inputs of the electric
signal data set, and we consider one artificial neural network
output denoted as d1,k. We utilize 7000 iterations of the data
set for the artificial neural network training, and we utilize
1000 iterations of the data set for the artificial neural network
testing. The objective of N, LM, and MLM is that the artificial
neural network output d1,k must follow the target t1,k as near
as possible.
The N of [1] and [2] is detailed as (1), (3), (18), and (19)
with eight inputs, one output, and five neurons in the hidden
layer, α = 0.9, pji,1 = rand, qlj,1 = rand, and rand is a
random number between 0 and 1.
The LM of [8]–[11] is detailed as (1), (3), (23), and (24)
with eight inputs, one output, and five neurons in the hidden
layer, α = 0.9, pji,1 = rand, qlj,1 = rand, and rand is a
random number between 0 and 1.
The MLM is detailed as (1), (3), (35), and (36), with
eight inputs, one output, and five neurons in the hidden layer,
α = 0.9, pji,1 = rand, qlj,1 = rand, and rand is a random
number between 0 and 1.
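The shared configuration of the three algorithms can be sketched as follows, reusing the mlm_step sketch given after Remark 6; load_electric_data is a hypothetical loader standing in for the data set of [35].

import numpy as np

rng = np.random.default_rng()
I, J, L = 8, 5, 1                  # eight inputs, five hidden neurons, one output
alpha = 0.9
P = rng.random((J, I))             # p_{ji,1} = rand, random numbers in [0, 1)
Q = rng.random((L, J))             # q_{lj,1} = rand
# inputs, targets = load_electric_data()   # hypothetical: 7000 training rows
# for k in range(7000):                    # training stage
#     P, Q, e = mlm_step(inputs[k], P, Q, targets[k], alpha)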
The comparisons for the training and testing of the N, LM,
and MLM for the first electric signal data set are shown in
Figs. 3 and 4. The weights of the MLM for the first electric
signal data set are shown in Figs. 5 and 6. The comparisons
for the training and testing of the N, LM, and MLM for the
second electric signal data set are shown in Figs. 7 and 8. The
weights of the MLM for the second electric signal data set
are shown in Figs. 9 and 10. The training and testing RMSE
comparisons of the performance accuracy (61) for the first
electric signal data set are shown in Table I and, for the second
electric signal data set, are shown in Table II. Please note that
the most important data are related to the output d1,k.
Fig. 3. Training for the first electric signal data set.
Fig. 4. Testing for the first electric signal data set.
Fig. 5. Hidden layer weights for the first electric signal data set.
Fig. 6. Output layer weights for the first electric signal data set.
Fig. 7. Training for the second electric signal data set.
Fig. 8. Testing for the second electric signal data set.
Fig. 9. Hidden layer weights for the second electric signal data set.
Fig. 10. Output layer weights for the second electric signal data set.
TABLE I. RMSE FOR THE FIRST ELECTRIC SIGNAL DATA SET.
TABLE II. RMSE FOR THE SECOND ELECTRIC SIGNAL DATA SET.

To improve the training and testing, more neurons could be included in the hidden layer; nevertheless, this decision could increase the computational cost. From Figs. 3, 4, 7, and 8, it is observed that the MLM improves on the LM and N because the signal of the MLM follows the electric signal data set better than the others. From Figs. 5, 6, 9, and 10, it is observed that the weights of the MLM remain bounded. From Tables I and II, it is observed that the MLM achieves better performance accuracy for training and testing compared with the LM and N because the RMSE is the smallest for the MLM. Thus, the MLM is the best option for learning in the electric signal data set.
B. Brain Signals
The brain signal data set information is obtained from our laboratory, where the details are explained in [36]. The brain signal data set consists of real brain signal data. The alpha signal is considered in this study because it is the most likely to be found. The acquisition system is applied to a 28-year-old healthy man with his eyes closed. Four different signals are received from the brain.
In the brain signal data set, we consider three inputs described as follows: a_{1,k} is the brain signal of focal point 1, a_{2,k} is the brain signal of focal point 2, and a_{3,k} is the brain signal of focal point 3. We consider one target described as follows: t_{1,k} is the brain signal of focal point 4.
In the artificial neural network learning, we consider three
artificial neural network inputs denoted as a1,k, a2,k, and a3,k
that are the same inputs of the brain signal data set, and we
consider one artificial neural network output denoted as d1,k.
We utilize 7000 iterations of the data set for the artificial
neural network training, and we utilize 1000 iterations of the
data set for the artificial neural network testing. The objective
of SGNN, SGRBFNN, and MLM is that the artificial neural
network output d1,k must follow the target t1,k as near as
possible.
The SGNN of [31] and [32] is detailed with three inputs,
one output, and five neurons in the hidden layer, α = 0.9,
pji,1 = rand, qlj,1 = rand, and rand is a random number
between 0 and 1.
The SGRBFNN of [33] and [34] is detailed with three
inputs, one output, and five neurons in the hidden layer,
α = 0.9, pji,1 = rand, qlj,1 = rand, and rand is a random
number between 0 and 1.
The MLM is detailed as (1), (3), (35), and (36) with three
inputs, one output, and five neurons in the hidden layer,
α = 0.9, pji,1 = rand, qlj,1 = rand, and rand is a random
number between 0 and 1.
Fig. 11. Training for the first brain signal data set.
Fig. 12. Testing for the first brain signal data set.
Fig. 13. Hidden layer weights for the first brain signal data set.
Fig. 14. Output layer weights for the first brain signal data set.
Fig. 15. Training for the second brain signal data set.
Fig. 16. Testing for the second brain signal data set.
Fig. 17. Hidden layer weights for the second brain signal data set.
Fig. 18. Output layer weights for the second brain signal data set.
TABLE III. RMSE FOR THE FIRST BRAIN SIGNAL DATA SET.
TABLE IV. RMSE FOR THE SECOND BRAIN SIGNAL DATA SET.

The comparisons for the training and testing of the SGNN, SGRBFNN, and MLM for the first brain signal data set are shown in Figs. 11 and 12. The weights of the MLM for the first brain signal data set are shown in Figs. 13 and 14. The comparisons for the training and testing of the SGNN, SGRBFNN, and MLM for the second brain signal data set are shown in Figs. 15 and 16. The weights of the MLM for the second brain signal data set are shown in Figs. 17 and 18. The training and testing RMSE comparisons of the performance accuracy (61) are shown in Table III for the first brain signal data set and in Table IV for the second brain signal data set. Please note that the most important data are related to the output d_{1,k}.
To improve the training and testing, more neurons could be included in the hidden layer; nevertheless, this decision could increase the computational cost. From Figs. 11, 12, 15, and 16, it is observed that the MLM improves on the SGRBFNN and SGNN because the signal of the MLM follows the brain signal data set better than the others. From Figs. 13, 14, 17, and 18, it is observed that the weights of the MLM remain bounded. From Tables III and IV, it is observed that the MLM achieves better performance accuracy for training and testing compared with the SGRBFNN and SGNN because the RMSE is the smallest for the MLM. Thus, the MLM is the best option for learning in the brain signal data set.
Remark 9: The result of Theorem 1, that the error of the MLM is assured to be stable while the errors of the N, LM, SGNN, and SGRBFNN are not assured to be stable, can be observed mainly in the training of Figs. 3, 7, 11, and 15 and in the testing of Figs. 4, 8, 12, and 16, where the signals of the N, LM, and SGNN are unbounded during the training or testing, while the signal of the MLM remains bounded during all the training and testing.
Remark 10: The result of Theorem 2, that the weights of the MLM are bounded, can be observed mainly in the hidden layer weights of Figs. 5, 9, 13, and 17 and in the output layer weights of Figs. 6, 10, 14, and 18, where the weights of the MLM remain bounded during all the training. The weights of the MLM also remain bounded during all the testing because they keep the last values obtained during the training.
VI. CONCLUSION
The objective of this article is to introduce an algorithm called the modified Levenberg–Marquardt for the artificial neural network learning. The modified Levenberg–Marquardt algorithm was compared with the Newton, Levenberg–Marquardt, and stable gradient algorithms for the learning of the electric and brain signal data sets. The modified Levenberg–Marquardt algorithm obtained the best performance accuracy: its artificial neural network output followed the data set target most closely, and it obtained the smallest RMSE. In forthcoming work, we will propose other algorithms for the artificial neural network learning to compare with our results, or we will apply our algorithm for the learning of other robotic or mechatronic systems.
ACKNOWLEDGMENT
The author is grateful to the Editor-in-Chief, the Associate Editor, and the reviewers for their valuable comments and insightful suggestions, which helped to improve this research significantly. He would also like to thank the Instituto Politécnico Nacional, the Secretaría de Investigación y Posgrado, the Comisión de Operación y Fomento de Actividades Académicas, and the Consejo Nacional de Ciencia y Tecnología for their help in this research.
REFERENCES
[1] S. Kostić and D. Vasović, “Prediction model for compressive strength of
basic concrete mixture using artificial neural networks,” Neural Comput.
Appl., vol. 26, no. 5, pp. 1005–1024, Jul. 2015.
[2] B. Sahoo and P. K. Bhaskaran, “Prediction of storm surge and inundation
using climatological datasets for the indian coast using soft computing
techniques,” Soft Comput., vol. 23, no. 23, pp. 12363–12383, Dec. 2019.
[3] T.-L. Le, “Intelligent fuzzy controller design for antilock braking sys-
tems,” J. Intell. Fuzzy Syst., vol. 36, no. 4, pp. 3303–3315, Apr. 2019.
[4] C. Yin, S. Wu, S. Zhou, J. Cao, X. Huang, and Y. Cheng, “Design
and stability analysis of multivariate extremum seeking with Newton
method,” J. Franklin Inst., vol. 355, no. 4, pp. 1559–1578, Mar. 2018.
[5] S. Chaki, B. Shanmugarajan, S. Ghosal, and G. Padmanabham, “Application of integrated soft computing techniques for optimisation of hybrid CO2 laser–MIG welding process,” Appl. Soft Comput., vol. 30, pp. 365–374, May 2015.
[6] Y. Li, H. Zhang, J. Han, and Q. Sun, “Distributed multi-agent opti-
mization via event-triggered based continuous-time Newton–Raphson
algorithm,” Neurocomputing, vol. 275, pp. 1416–1425, Jan. 2018.
[7] M. S. Salim and A. I. Ahmed, “A quasi-Newton augmented lagrangian
algorithm for constrained optimization problems,” J. Intell. Fuzzy Syst.,
vol. 35, no. 2, pp. 2373–2382, Aug. 2018.
[8] C. Lv et al., “Levenberg–Marquardt backpropagation training of multilayer neural networks for state estimation of a safety-critical cyber-physical system,” IEEE Trans. Ind. Informat., vol. 14, no. 8, pp. 3436–3446, Aug. 2018.
[9] M. J. Rana, M. S. Shahriar, and M. Shafiullah, “Levenberg–Marquardt
neural network to estimate UPFC-coordinated PSS parameters to
enhance power system stability,” Neural Comput. Appl., vol. 31,
pp. 1237–1248, Jul. 2019.
[10] A. Sarabakha, N. Imanberdiyev, E. Kayacan, M. A. Khanesar, and
H. Hagras, “Novel Levenberg–Marquardt based learning algorithm for
unmanned aerial vehicles,” Inf. Sci., vol. 417, pp. 361–380, Nov. 2017.
[11] J. S. Smith, B. Wu, and B. M. Wilamowski, “Neural network training
with Levenberg–Marquardt and adaptable weight compression,” IEEE
Trans. Neural Netw. Learn. Syst., vol. 30, no. 2, pp. 580–587, Feb. 2019.
[12] H. G. Han, Y. Li, Y. N. Guo, and J. F. Qiao, “A soft computing method to
predict sludge volume index based on a recurrent self-organizing neural
network,” Appl. Soft Comput., vol. 38, pp. 477–486, Jan. 2016.
[13] J. Qiao, L. Wang, C. Yang, and K. Gu, “Adaptive Levenberg-Marquardt
algorithm based echo state network for chaotic time series prediction,”
IEEE Access, vol. 6, pp. 10720–10732, 2018.
[14] A. Parsaie, A. H. Haghiabi, M. Saneie, and H. Torabi, “Applica-
tions of soft computing techniques for prediction of energy dissipa-
tion on stepped spillways,” Neural Comput. Appl., vol. 29, no. 12,
pp. 1393–1409, Jun. 2018.
[15] N. Zhang and D. Shetty, “An effective LS-SVM-based approach for
surface roughness prediction in machined surfaces,” Neurocomputing,
vol. 198, pp. 35–39, Jul. 2016.
[16] E. Esme and B. Karlik, “Fuzzy c-means based support vector machines
classifier for perfume recognition,” Appl. Soft Comput., vol. 46,
pp. 452–458, Sep. 2016.
[17] P. Fergus, I. Idowu, A. Hussain, and C. Dobbins, “Advanced artificial
neural network classification for detecting preterm births using EHG
records,” Neurocomputing, vol. 188, pp. 42–49, May 2016.
[18] A. Narang, B. Batra, A. Ahuja, J. Yadav, and N. Pachauri, “Classifica-
tion of EEG signals for epileptic seizures using Levenberg-Marquardt
algorithm based multilayer perceptron neural network,” J. Intell. Fuzzy
Syst., vol. 34, no. 3, pp. 1669–1677, Mar. 2018.
[19] J. Dong, K. Lu, J. Xue, S. Dai, R. Zhai, and W. Pan, “Accelerated non-
rigid image registration using improved Levenberg–Marquardt method,”
Inf. Sci., vol. 423, pp. 66–79, Jan. 2018.
[20] J. Li, W. X. Zheng, J. Gu, and L. Hua, “Parameter estimation algorithms
for Hammerstein output error systems using Levenberg–Marquardt opti-
mization method with varying interval measurements,” J. Franklin Inst.,
vol. 354, pp. 316–331, Jan. 2017.
[21] X. Yang, B. Huang, and H. Gao, “A direct maximum likelihood
optimization approach to identification of LPV time-delay systems,”
J. Franklin Inst., vol. 353, no. 8, pp. 1862–1881, May 2016.
[22] I. S. Baruch, V. A. Quintana, and E. P. Reynaud, “Complex-valued neural
network topology and learning applied for identification and control of
nonlinear systems,” Neurocomputing, vol. 233, pp. 104–115, Apr. 2017.
[23] M. Kaminski and T. Orlowska-Kowalska, “An on-line trained neural
controller with a fuzzy learning rate of the Levenberg–Marquardt
algorithm for speed control of an electrical drive with an elastic joint,”
Appl. Soft Comput., vol. 32, pp. 509–517, Jul. 2015.
[24] S. Roshan, Y. Miche, A. Akusok, and A. Lendasse, “Adaptive and online
network intrusion detection system using clustering and extreme learning
machines,” J. Franklin Inst., vol. 355, no. 4, pp. 1752–1779, Mar. 2018.
[25] C. Bishop, “Exact calculation of the hessian matrix for the multilayer
perceptron,” Neural Comput., vol. 4, no. 4, pp. 494–501, Jul. 1992.
[26] C. M. Bishop, “A fast procedure for retraining the multilayer percep-
tron,” Int. J. Neural Syst., vol. 2, no. 3, pp. 229–236, 1991.
[27] C. M. Bishop, “Curvature-driven smoothing in feedforward networks,”
in Proc. Seattle Int. Joint Conf. Neural Netw. (IJCNN), 1990, p. 749.
[28] G. Cybenko, “Approximation by superpositions of a sigmoidal function,”
Math. Control, Signals, Syst., vol. 2, no. 4, pp. 303–314, Dec. 1989.
[29] R. B. Ash, Real Analysis and Probability. New York, NY, USA:
Academic, 1972.
[30] J. S. R. Jang and C. T. Sun, Neuro-Fuzzy and Soft Computing. Upper
Saddle River, NJ, USA: Prentice-Hall, 1996.
[31] J. de Jesús Rubio, P. Angelov, and J. Pacheco, “Uniformly stable
backpropagation algorithm to train a feedforward neural network,” IEEE
Trans. Neural Netw., vol. 22, no. 3, pp. 356–366, Mar. 2011.
[32] W. Yu and X. Li, “Discrete-time neuro identification without robust mod-
ification,” IEE Proc.-Control Theory Appl., vol. 150, no. 3, pp. 311–316,
May 2003.
[33] J. D. J. Rubio, I. Elias, D. R. Cruz, and J. Pacheco, “Uniform stable
radial basis function neural network for the prediction in two mecha-
tronic processes,” Neurocomputing, vol. 227, pp. 122–130, Mar. 2017.
[34] J. D. J. Rubio, “USNFIS: Uniform stable neuro fuzzy inference system,”
Neurocomputing, vol. 262, pp. 57–66, Nov. 2017.
[35] I. Elias et al., “Genetic algorithm with radial basis mapping network
for the electricity consumption modeling,” Appl. Sci., vol. 10, no. 12,
p. 4239, Jun. 2020.
[36] J. D. J. Rubio, D. M. Vázquez, and D. Mújica-Vargas, “Acquisition
system and approximation of brain signals,” IET Sci., Meas. Technol.,
vol. 7, no. 4, pp. 232–239, Jul. 2013.
José de Jesús Rubio (Member, IEEE) is currently
a full-time Professor with the Sección de Estudios
de Posgrado e Investigación, ESIME Azcapotzalco,
Instituto Politécnico Nacional, Ciudad de México,
Mexico. He has published over 142 international
journal articles with 2214 cites from Scopus. He has
been the tutor of four Ph.D. students, 20 Ph.D.
students, 42 M.S. students, 4 S. students, and 17 B.S.
students.
Dr. Rubio was a Guest Editor of Neurocomputing,
Applied Soft Computing, Sensors, The Journal of
Supercomputing, Computational Intelligence and Neuroscience, Frontiers in
Psychology, and the Journal of Real-Time Image Processing. He also serves as
an Associate Editor for the IEEE TRANSACTIONS ON NEURAL NETWORKS
AND LEARNING SYSTEMS, the IEEE TRANSACTIONS ON FUZZY SYSTEMS,
Neural Computing and Applications, Frontiers in Neurorobotics, and Mathe-
matical Problems in Engineering.