This article presents a promising new gradient-based backpropagation algorithm for multi-layer feedforward networks. The method requires no manual selection of global hyperparameters and performs dynamic local adaptations using only first-order information at low computational cost. Its semi-stochastic nature makes it well suited to mini-batch training and robust to different architecture choices and data distributions. Experimental evidence shows that the proposed algorithm improves training in terms of both convergence rate and speed compared with other well-known techniques.
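The abstract does not specify the adaptation rule, so as a purely illustrative stand-in, the sketch below shows one classic way to adapt a per-weight step size from first-order information only: an Rprop-style sign rule. All constants and names here are assumptions, not the paper's method.

```python
import numpy as np

def rprop_step(w, grad, prev_grad, step, eta_plus=1.2, eta_minus=0.5,
               step_min=1e-6, step_max=1.0):
    """One Rprop-style update: each weight keeps its own local step size,
    grown while the gradient sign is stable and shrunk when it flips."""
    sign_change = grad * prev_grad
    step = np.where(sign_change > 0, np.minimum(step * eta_plus, step_max), step)
    step = np.where(sign_change < 0, np.maximum(step * eta_minus, step_min), step)
    w = w - np.sign(grad) * step
    return w, step
```

On a one-dimensional quadratic the step size grows while descending, then shrinks as the iterate starts oscillating around the minimum, with no global learning rate to tune.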
A review of the literature shows a variety of works studying coverage path planning in autonomous robotic applications. In this work, we propose a new approach using a Markov Decision Process to plan an optimal path for the general goal of exploring an unknown environment containing buried mines. This approach, called the Goals-to-Goals Area Coverage Online Algorithm, is based on a decomposition of the state space into smaller regions whose states are treated as goals with the same reward value; the reward value is decremented from one region to the next according to the desired search mode. Numerical simulations show that our approach is promising for minimizing the energy cost needed to cover the entire area.
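The region-decremented reward idea can be sketched on a toy grid world: regions get progressively smaller goal rewards, and value iteration then produces a value landscape that pulls the robot through the regions in the desired order. Everything below (grid size, reward levels, discount) is an illustrative assumption, not taken from the paper.

```python
import numpy as np

# Toy 6x6 area split into three vertical regions whose goal rewards
# decrease left to right, mimicking the goals-to-goals decomposition.
H, W = 6, 6
reward = np.zeros((H, W))
reward[:, 0:2] = 3.0   # first region to search
reward[:, 2:4] = 2.0   # second region
reward[:, 4:6] = 1.0   # last region

gamma = 0.9
V = np.zeros((H, W))
for _ in range(100):  # value iteration with 4-neighbour moves
    Vp = np.pad(V, 1, mode="edge")
    neighbors = np.stack([Vp[:-2, 1:-1], Vp[2:, 1:-1],
                          Vp[1:-1, :-2], Vp[1:-1, 2:]])
    V = reward + gamma * neighbors.max(axis=0)
```

A greedy policy on `V` climbs toward the highest-reward region first, which is the ordering behaviour the decremented rewards are meant to induce.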
Comparison of Emergency Medical Services Delivery Performance using Maximal C... (IJECEIAES)
Ambulance location is one of the critical factors that determine the efficiency of emergency medical services delivery. The Maximal Covering Location Problem (MCLP) is one of the most widely used ambulance location models. However, its coverage function is considered unrealistic because coverage changes abruptly from fully covered to uncovered. In contrast, Gradual Cover Location Problem (GCLP) coverage is considered more realistic because coverage decreases gradually with distance. This paper examines the delivery of emergency medical services under both models. The results show that the gradual model is superior, especially for demand points that the MCLP has already deemed fully covered.
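The contrast between the two coverage functions is easy to make concrete. The radii below are illustrative assumptions; the point is the shape of each function, not the numbers.

```python
def maximal_coverage(d, r=5.0):
    """MCLP-style step function: full coverage inside radius r, none beyond."""
    return 1.0 if d <= r else 0.0

def gradual_coverage(d, r1=3.0, r2=8.0):
    """GCLP-style function: full coverage up to r1, linear decay to zero at r2."""
    if d <= r1:
        return 1.0
    if d >= r2:
        return 0.0
    return (r2 - d) / (r2 - r1)
```

A demand point at distance 5.1 flips from fully covered to uncovered under the step model, while the gradual model still credits it with partial coverage.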
A COMPARATIVE STUDY OF BACKPROPAGATION ALGORITHMS IN FINANCIAL PREDICTION (IJCSEA Journal)
Stock market price index prediction is a challenging task for investors and scholars. Artificial neural networks have been widely employed to predict financial stock market levels thanks to their ability to model nonlinear functions. The accuracy of backpropagation neural networks trained with different heuristic and numerical algorithms is measured for comparison purposes. It is found that numerical algorithms outperform heuristic techniques.
In this work, we propose to apply trust region optimization to deep reinforcement
learning using a recently proposed Kronecker-factored approximation to
the curvature. We extend the framework of natural policy gradient and propose
to optimize both the actor and the critic using Kronecker-factored approximate
curvature (K-FAC) with trust region; hence we call our method Actor Critic using
Kronecker-Factored Trust Region (ACKTR). To the best of our knowledge, this
is the first scalable trust region natural gradient method for actor-critic methods.
It is also a method that learns non-trivial tasks in continuous control as well as
discrete control policies directly from raw pixel inputs. We tested our approach
across discrete domains in Atari games as well as continuous domains in the MuJoCo
environment. With the proposed methods, we are able to achieve higher
rewards and a 2- to 3-fold improvement in sample efficiency on average, compared
to previous state-of-the-art on-policy actor-critic methods. Code is available at
https://github.com/openai/baselines.
Hybrid Multi-Gradient Explorer Algorithm for Global Multi-Objective Optimization (eArtius, Inc.)
The Hybrid Multi-Gradient Explorer (HMGE) algorithm for global multi-objective optimization of objective functions over a multi-dimensional domain is presented. The proposed hybrid algorithm relies on genetic variation operators to create new solutions, but in addition to a standard random mutation operator, HMGE uses a gradient mutation operator, which improves convergence. Thus, random mutation helps find the global Pareto frontier, and gradient mutation improves convergence to it. In this way the HMGE algorithm combines the advantages of both gradient-based and GA-based optimization techniques: it is as fast as the pure gradient-based MGE algorithm, yet able to find the global Pareto frontier, as genetic algorithms (GA) do. HMGE employs the Dynamically Dimensioned Response Surface Method (DDRSM) to calculate gradients. DDRSM dynamically recognizes the most significant design variables and builds local approximations based only on those variables. This allows gradients to be estimated at the cost of 4-5 model evaluations without significant loss of accuracy. As a result, HMGE efficiently optimizes highly non-linear models with dozens or hundreds of design variables and with multiple Pareto fronts. HMGE efficiency is 2-10 times higher than that of the most advanced commercial GAs.
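The two mutation operators can be sketched side by side. The gradient estimator below is a plain forward difference, standing in for DDRSM purely for illustration; the step sizes and the single-objective setting are likewise assumptions.

```python
import numpy as np

def numeric_grad(f, x, h=1e-6):
    """Forward-difference gradient estimate (a cheap stand-in for DDRSM)."""
    fx = f(x)
    return np.array([(f(x + h * e) - fx) / h for e in np.eye(len(x))])

def random_mutation(x, sigma=0.3, seed=0):
    """GA-style random mutation: global exploration of the domain."""
    rng = np.random.default_rng(seed)
    return x + rng.normal(0.0, sigma, size=x.shape)

def gradient_mutation(f, x, step=0.1):
    """Gradient mutation: a descent step that accelerates local convergence."""
    g = numeric_grad(f, x)
    return x - step * g / (np.linalg.norm(g) + 1e-12)
```

Random mutation can escape a local basin; gradient mutation is guaranteed (for small steps on smooth objectives) to move downhill, which is exactly the convergence role it plays inside HMGE.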
The assignment problem is a special type of linear programming problem and a subclass of the transportation problem. Assignment problems are defined by two sets of inputs: a set of resources and a set of demands. The Hungarian algorithm can solve assignment problems with precisely defined demands and resources. Nowadays, many organizations and competing companies market their products through many salespersons, who travel from one city to another. A key problem is deciding which city each salesperson should visit at minimum cost, so the travelling assignment problem is central to many business functions. Mie Mie Aung | Yin Yin Cho | Khin Htay | Khin Soe Myint, "Minimization of Assignment Problems", International Journal of Trend in Scientific Research and Development (IJTSRD), ISSN 2456-6470, Volume 3, Issue 5, August 2019. URL: https://www.ijtsrd.com/papers/ijtsrd26712.pdf Paper URL: https://www.ijtsrd.com/computer-science/other/26712/minimization-of-assignment-problems/mie-mie-aung
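A minimum-cost salesperson-to-city assignment of this kind can be solved with SciPy's `linear_sum_assignment`, which implements a Hungarian-style algorithm. The cost matrix below is invented for illustration.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment  # Hungarian-style solver

# Hypothetical travel costs: rows = salespersons, columns = cities.
cost = np.array([[9, 2, 7],
                 [6, 4, 3],
                 [5, 8, 1]])

rows, cols = linear_sum_assignment(cost)  # one city per salesperson, min total cost
total = cost[rows, cols].sum()
```

Here the optimum sends salesperson 0 to city 1, salesperson 1 to city 0, and salesperson 2 to city 2, for a total cost of 9; brute-force enumeration of all 3! assignments confirms no cheaper choice exists.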
The International Journal of Engineering and Science (The IJES) (theijes)
The International Journal of Engineering & Science is aimed at providing a platform for researchers, engineers, scientists, or educators to publish their original research results, to exchange new ideas, to disseminate information in innovative designs, engineering experiences and technological skills. It is also the Journal's objective to promote engineering and technology education. All papers submitted to the Journal will be blind peer-reviewed. Only original articles will be published.
GRASP Approach to RCPSP with Min-Max Robustness Objective (csandit)
This paper deals with the Resource-Constrained Project Scheduling Problem (RCPSP) under activity duration uncertainty. Based on scenarios, the objective is to minimize the worst-case performance among a set of initial scenarios, which is referred to as the min-max robustness objective. Due to the complexity of the tackled problem, we propose applying the GRASP method, a simple and effective multi-start metaheuristic. The proposed approach incorporates an adaptive greedy function based on priority rules to construct new solutions, and a local search with a forward-backward heuristic in the improvement phase. Two benchmark data sets are investigated: the Patterson set and the PSPLIB J30 set. Comparative results show that the proposed enhanced GRASP outperforms the basic procedure in robustness optimization.
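The GRASP skeleton (randomized greedy construction followed by local search, repeated from many starts) can be shown on a deliberately simple objective. The single-machine weighted-completion-time objective below is a stand-in assumption, not the paper's RCPSP robustness objective, and the RCL fraction `alpha` is likewise invented.

```python
import random

def grasp(durations, weights, iters=30, alpha=0.5, seed=0):
    """Toy GRASP: greedy randomized construction via a restricted candidate
    list (RCL), then first-improvement pairwise-swap local search."""
    rng = random.Random(seed)
    n = len(durations)

    def cost(order):
        t = total = 0
        for j in order:
            t += durations[j]
            total += weights[j] * t
        return total

    best = None
    for _ in range(iters):
        # construction: choose randomly among the alpha-fraction best candidates
        remaining = sorted(range(n), key=lambda j: durations[j] / weights[j])
        order = []
        while remaining:
            rcl = remaining[:max(1, int(alpha * len(remaining)))]
            j = rng.choice(rcl)
            remaining.remove(j)
            order.append(j)
        # improvement: keep swapping pairs while it lowers the cost
        improved = True
        while improved:
            improved = False
            for a in range(n - 1):
                for b in range(a + 1, n):
                    cand = order[:]
                    cand[a], cand[b] = cand[b], cand[a]
                    if cost(cand) < cost(order):
                        order, improved = cand, True
        if best is None or cost(order) < cost(best):
            best = order
    return best, cost(best)
```

The randomized construction gives each restart a different seed solution, which is what lets the multi-start scheme escape the local optima a single greedy pass would land in.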
A HYBRID CLUSTERING ALGORITHM FOR DATA MINING (cscpconf)
Data clustering is a process of arranging similar data into groups. A clustering algorithm partitions a data set into several groups such that similarity is greater within a group than between groups. In this paper a hybrid clustering algorithm based on K-means and K-harmonic means (KHM) is described. The proposed algorithm is tested on five different data sets, with a focus on fast and accurate clustering, and its performance is compared with the traditional K-means and KHM algorithms. The results obtained from the proposed hybrid algorithm are much better than those of the traditional K-means and KHM algorithms.
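The KHM half of such a hybrid differs from K-means in using soft, distance-weighted memberships rather than hard assignments. Below is one KHM center-update step in Zhang's formulation; the exponent p = 3.5 is a commonly cited default and is assumed here, as is the idea of using this step to refine K-means centers (the paper's exact hybridization is not specified in the abstract).

```python
import numpy as np

def khm_step(X, C, p=3.5, eps=1e-8):
    """One K-harmonic-means center update: soft memberships and per-point
    weights derived from inverse distances, then a weighted mean."""
    d = np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2) + eps
    q = d ** (-p - 2)
    m = q / q.sum(axis=1, keepdims=True)                 # soft memberships
    w = q.sum(axis=1) / (d ** (-p)).sum(axis=1) ** 2     # per-point weights
    mw = m * w[:, None]
    return (mw[:, :, None] * X[:, None, :]).sum(axis=0) / mw.sum(axis=0)[:, None]
```

Because every point influences every center (with rapidly decaying weight), KHM is far less sensitive to center initialization than plain K-means, which is the usual motivation for hybridizing the two.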
Asynchronous parallel algorithms are developed to solve massive optimization problems in distributed data systems; they can run in parallel on multiple nodes with little or no synchronization. Recently they have been successfully applied to a range of difficult practical problems. However, existing theories are mostly based on fairly restrictive assumptions on the delays and cannot explain the convergence and speedup properties of such algorithms. This talk gives an overview of distributed optimization and discusses new theoretical results on the convergence of the asynchronous parallel stochastic gradient algorithm with unbounded delays. Simulated and real data are used to demonstrate the practical implications of these theoretical results.
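The staleness at the heart of these analyses is easy to simulate on one machine: apply each update using the iterate from several steps ago, as a worker reading stale parameters would. The fixed delay and toy quadratic below are illustrative assumptions (the talk's results concern unbounded delays).

```python
import numpy as np

def delayed_sgd(grad, w0, lr=0.05, delay=5, steps=200):
    """Gradient descent where each update uses a gradient evaluated on the
    iterate from `delay` steps ago, mimicking a stale asynchronous read."""
    history = [w0.copy()]
    w = w0.copy()
    for t in range(steps):
        stale = history[max(0, t - delay)]
        w = w - lr * grad(stale)
        history.append(w.copy())
    return w
```

With a small enough learning rate the iteration still converges despite the delay; pushing `lr` or `delay` up makes it oscillate and then diverge, which is why delay assumptions dominate the existing theory.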
A Non-Parametric Estimation Based Underwater Target Classifier (CSCJournals)
Underwater noise sources constitute a prominent class of input signal in most underwater signal processing systems. The problem of identifying noise sources in the ocean is of great importance because of its numerous practical applications. In this paper, a methodology is presented for the detection and identification of underwater targets and noise sources based on non-parametric indicators. The proposed system utilizes cepstral coefficient analysis and the Kruskal-Wallis H statistic, along with other statistical indicators such as the F-test statistic, for the effective detection and classification of noise sources in the ocean. Simulation results for typical underwater noise data and the set of identified underwater targets are also presented.
IJERA (International Journal of Engineering Research and Applications) is an international, online, ... peer-reviewed journal. For more details or to submit your article, please visit www.ijera.com
Heuristic Based Adaptive Step Size CLMS Algorithms for Smart Antennas (csandit)
A smart antenna system combines multiple antenna elements with signal processing capability to optimize its radiation and/or reception pattern automatically, in response to the signal environment, through complex weight selection. Selecting weights that yield a suitable array factor with low half-power beam width (HPBW) and side lobe level (SLL) is a complex task. The aim of this work is to design a new approach for smart antennas that minimizes noise and interference effects from external sources in the fewest iterations. This paper presents a heuristics-based adaptive step size Complex Least Mean Square (CLMS) model for smart antennas to speed up convergence. The Benveniste and Mathews algorithms are used as heuristics with CLMS, and the resulting improvements in convergence rate and array factor are discussed and compared with the performance of the CLMS and Augmented CLMS (ACLMS) algorithms.
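For readers unfamiliar with CLMS, the sketch below shows the complex LMS tap update with the simplest form of step-size adaptation: normalizing by instantaneous input power (i.e. NLMS). The Benveniste and Mathews rules used in the paper adapt the step differently; this is only an illustrative stand-in, and all parameter values are assumptions.

```python
import numpy as np

def clms_nlms(x, d, n_taps=4, mu=0.5, eps=1e-8):
    """Complex LMS whose step size adapts to the input power at each sample."""
    w = np.zeros(n_taps, dtype=complex)
    for n in range(n_taps, len(x)):
        u = x[n - n_taps:n][::-1]                 # tap-delay input vector
        e = d[n] - np.vdot(w, u)                  # a-priori error, e = d - w^H u
        w = w + (mu / (eps + np.vdot(u, u).real)) * np.conj(e) * u
    return w
```

On a system-identification task (desired signal generated by a known complex filter), the adapted weights converge to the true taps; for beamforming, the same update steers the array weights instead.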
Proposing a New Job Scheduling Algorithm in Grid Environment Using a Combinat... (Editor IJCATR)
Grid computing is a hardware and software infrastructure that provides affordable, sustainable, and reliable access; its aim is to create a supercomputer from free resources. One of the challenges in Grid computing is the scheduling problem, which is regarded as hard. Since scheduling is non-deterministic in the Grid, deterministic algorithms cannot be used to improve it. In this paper, a combination of the imperialist competition algorithm (ICA) and gravitational attraction is used to address the problem of independent task scheduling in a grid environment, with the aim of reducing makespan and energy. Experimental results compare ICA with other algorithms and show that ICA achieves a shorter makespan and lower energy than the others. Moreover, it converges quickly, finding its optimal solution in less time than the other algorithms.
A Novel Forecasting Based on Automatic-optimized Fuzzy Time Series (TELKOMNIKA JOURNAL)
In this paper, we propose a new method for forecasting based on automatic-optimized fuzzy time series to forecast the Indonesia Inflation Rate (IIR). First, we propose the forecasting model of two-factor high-order fuzzy-trend logical relationship groups (THFLGs) for predicting the IIR. Second, we propose interval optimization using automatic clustering and particle swarm optimization (ACPSO) to optimize the intervals of the main factor, IIR, and the secondary factors SF, where SF = {Consumer Price Index (CPI), Bank of Indonesia (BI) Rate, Rupiah/US Dollar (IDR/USD) exchange rate, Money Supply}. The proposed method achieves a lower root mean square error (RMSE) than previous methods.
Evolution of regenerative Ca-ion wave-packet in Neuronal-firing Fields: Quant... (AM Publications, India)
The author previously developed a numerical multivariate path-integral algorithm, PATHINT, which has been applied to several classical physics systems, including statistical mechanics of neocortical interactions, options in financial markets, and other nonlinear systems including chaotic systems. A new quantum version, qPATHINT, has the ability to take into account nonlinear and time-dependent modifications of an evolving system. qPATHINT is shown to be useful to study some aspects of serial changes to systems. Applications ranging from regenerative waves in neuroscience to options on blockchains in financial markets are discussed.
Comparative study of optimization algorithms on convolutional network for aut... (IJECEIAES)
The last 10 years have been the decade of autonomous vehicles: advances in intelligent sensors and control schemes have shown the possibility of real applications. Deep learning, and in particular convolutional networks, has become a fundamental tool in solving problems related to environment identification, path planning, vehicle behavior, and motion control. In this paper, we perform a comparative study of the most used optimization strategies on the convolutional residual neural network architecture (ResNet) for an autonomous driving problem, as a step toward the development of an intelligent sensor. This sensor, part of our research in reactive systems for autonomous vehicles, aims to become a system for direct mapping of sensory information to control actions from real-time images of the environment. The optimization techniques analyzed include stochastic gradient descent (SGD), adaptive gradient (Adagrad), adaptive learning rate (Adadelta), root mean square propagation (RMSProp), Adamax, adaptive moment estimation (Adam), Nesterov-accelerated adaptive moment estimation (Nadam), and follow the regularized leader (Ftrl). Training of the deep model is evaluated in terms of convergence, accuracy, recall, and F1-score. Preliminary results show better performance of the deep network when using SGD as the optimizer, while Ftrl presents the poorest performance.
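The update rules of two of the compared optimizers, SGD and Adam, can be written out in a few lines each; in a framework, these loops are replaced by the framework's optimizer objects. The toy quadratic, learning rates, and step counts below are assumptions for illustration.

```python
import numpy as np

def sgd(grad, w, lr=0.1, steps=200):
    """Plain gradient descent: a fixed step along the negative gradient."""
    for _ in range(steps):
        w = w - lr * grad(w)
    return w

def adam(grad, w, lr=0.1, b1=0.9, b2=0.999, eps=1e-8, steps=200):
    """Adam: per-parameter step from bias-corrected first/second moments."""
    m = np.zeros_like(w)
    v = np.zeros_like(w)
    for t in range(1, steps + 1):
        g = grad(w)
        m = b1 * m + (1 - b1) * g
        v = b2 * v + (1 - b2) * g * g
        w = w - lr * (m / (1 - b1 ** t)) / (np.sqrt(v / (1 - b2 ** t)) + eps)
    return w
```

On a smooth deterministic objective plain SGD can settle to very high precision, while Adam's normalized steps keep it hovering near the optimum; on noisy, badly scaled deep-learning losses the trade-off often reverses, which is why empirical comparisons like this paper's are needed.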
Level set methods are a popular way to solve the image segmentation problem: the solution contour is found by minimizing a cost functional. This paper presents a level set approach to simultaneous tissue segmentation and bias correction of Magnetic Resonance Imaging (MRI) images: a modified level set formulation for joint segmentation and bias correction of images with intensity inhomogeneity. A sliding window transforms the gradient intensity domain to another domain in which the distribution overlap between different tissues is significantly suppressed. Tissue segmentation and bias correction are achieved simultaneously via a multiphase level set evolution process. The proposed method is very robust to initialization and is directly compatible with any type of level set implementation. Experiments on images of various modalities demonstrate superior performance over state-of-the-art methods.
Memory Polynomial Based Adaptive Digital Predistorter (IJERA Editor)
Digital predistortion (DPD) is a baseband signal processing technique that corrects for impairments in RF power amplifiers (PAs). These impairments cause out-of-band emissions (spectral regrowth) and in-band distortion, which correlate with an increased bit error rate (BER). Wideband signals with a high peak-to-average ratio are more susceptible to these unwanted effects. To reduce these impairments, this paper proposes modeling the digital predistorter for the power amplifier using the GSA algorithm.
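The memory polynomial model in the title has a standard form: a sum of delayed copies of the input, each passed through odd-order-style nonlinearities with its own complex coefficient. The sketch below evaluates that model; the coefficient values in the usage example are invented (in the paper they would be identified by GSA).

```python
import numpy as np

def memory_polynomial(x, coeffs):
    """Memory polynomial model of a PA (or its predistorter):
        y[n] = sum_{k,q} coeffs[k, q] * x[n-q] * |x[n-q]|**k
    coeffs has one row per nonlinearity order k, one column per memory tap q."""
    K, Q1 = coeffs.shape
    y = np.zeros(len(x), dtype=complex)
    for q in range(Q1):
        # input delayed by q samples, zero pre-history
        xq = np.concatenate([np.zeros(q, dtype=complex), x[:len(x) - q]])
        for k in range(K):
            y += coeffs[k, q] * xq * np.abs(xq) ** k
    return y
```

With `coeffs = [[1.0], [0.5]]` (memoryless, linear plus one envelope-dependent term) the model reduces to `y = x + 0.5*x*|x|`, showing how larger envelopes are distorted more, which is exactly the behavior a predistorter must invert.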
EM algorithm is a common algorithm in data mining techniques. With the idea of using two iterations of E and M, the algorithm creates a model that can assign class labels to data points. In addition, EM not only optimizes the parameters of the model but also can predict device data during the iteration. Therefore, the paper focuses on researching and improving the EM algorithm to suit the LiDAR point cloud classification. Based on the idea of breaking point cloud and using the scheduling parameter for step E to help the algorithm converge faster with a shorter run time. The proposed algorithm is tested with measurement data set in Nghe An province, Vietnam for more than 92% accuracy and has faster runtime than the original EM algorithm.
LIDAR POINT CLOUD CLASSIFICATION USING EXPECTATION MAXIMIZATION ALGORITHMijnlc
EM algorithm is a common algorithm in data mining techniques. With the idea of using two iterations of E and M, the algorithm creates a model that can assign class labels to data points. In addition, EM not only optimizes the parameters of the model but also can predict device data during the iteration. Therefore, the paper focuses on researching and improving the EM algorithm to suit the LiDAR point cloud classification. Based on the idea of breaking point cloud and using the scheduling parameter for step E to help the algorithm converge faster with a shorter run time. The proposed algorithm is tested with measurement data set in Nghe An province, Vietnam for more than 92% accuracy and has faster runtime than the original EM algorithm.
The pertinent single-attribute-based classifier for small datasets classific...IJECEIAES
Classifying a dataset using machine learning algorithms can be a big challenge when the target is a small dataset. The OneR classifier can be used for such cases due to its simplicity and efficiency. In this paper, we revealed the power of a single attribute by introducing the pertinent single-attributebased-heterogeneity-ratio classifier (SAB-HR) that used a pertinent attribute to classify small datasets. The SAB-HR’s used feature selection method, which used the Heterogeneity-Ratio (H-Ratio) measure to identify the most homogeneous attribute among the other attributes in the set. Our empirical results on 12 benchmark datasets from a UCI machine learning repository showed that the SAB-HR classifier significantly outperformed the classical OneR classifier for small datasets. In addition, using the H-Ratio as a feature selection criterion for selecting the single attribute was more effectual than other traditional criteria, such as Information Gain (IG) and Gain Ratio (GR).
Validation Study of Dimensionality Reduction Impact on Breast Cancer Classifi...ijcsit
A fundamental problem in machine learning is identifying the most representative subset of features from
which we can construct a predictive model for a classification task. This paper aims to present a validation
study of dimensionality reduction effect on the classification accuracy of mammographic images. The
studied dimensionality reduction methods were: locality-preserving projection (LPP), locally linear
embedding (LLE), Isometric Mapping (ISOMAP) and spectral regression (SR). We have achieved high
rates of classifications. In some combinations the classification rate was 100%. But in most of the cases the
classification rate is about 95%. It was also found that the classification rate increases with the size of the
reduced space and the optimal value of space dimension is 60. We proceeded to validate the obtained
results by measuring some validation indices such as: Xie-Beni index, Dun index and Alternative Dun
index. The measurement of these indices confirms that the optimal value of reduced space dimension is
d=60.
Data imputing uses to posit missing data values, as missing data have a negative effect on the computation validity of models. This study develops a genetic algorithm (GA) to optimize imputing for missing cost data of fans used in road tunnels by the Swedish Transport Administration (Trafikverket). GA uses to impute the missing cost data using an optimized valid data period. The results show highly correlated data (R- squared 0.99) after imputing the missing data. Therefore, GA provides a wide search space to optimize imputing and create complete data. The complete data can be used for forecasting and life cycle cost analysis. Ritesh Kumar Pandey | Dr Asha Ambhaikar"Data Imputation by Soft Computing" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-2 | Issue-4 , June 2018, URL: http://www.ijtsrd.com/papers/ijtsrd14112.pdf http://www.ijtsrd.com/computer-science/real-time-computing/14112/data-imputation-by-soft-computing/ritesh-kumar-pandey
Fault-Tolerance Aware Multi Objective Scheduling Algorithm for Task Schedulin...csandit
Computational Grid (CG) creates a large heterogeneous and distributed paradigm to manage and execute the applications which are computationally intensive. In grid scheduling tasks are assigned to the proper processors in the grid system to for its execution by considering the execution policy and the optimization objectives. In this paper, makespan and the faulttolerance of the computational nodes of the grid which are the two important parameters for the task execution, are considered and tried to optimize it. As the grid scheduling is considered to be NP-Hard, so a meta-heuristics evolutionary based techniques are often used to find a solution for this. We have proposed a NSGA II for this purpose. The performance estimation ofthe proposed Fault tolerance Aware NSGA II (FTNSGA II) has been done by writing program in Matlab. The simulation results evaluates the performance of the all proposed algorithm and the results of proposed model is compared with existing model Min-Min and Max-Min algorithm which proves effectiveness of the model.
USING LEARNING AUTOMATA AND GENETIC ALGORITHMS TO IMPROVE THE QUALITY OF SERV...IJCSEA Journal
A hybrid learning automata–genetic algorithm (HLGA) is proposed to solve QoS routing optimization problem of next generation networks. The algorithm complements the advantages of the learning Automato Algorithm(LA) and Genetic Algorithm(GA). It firstly uses the good global search capability of LA to generate initial population needed by GA, then it uses GA to improve the Quality of Service(QoS) and acquiring the optimization tree through new algorithms for crossover and mutation operators which are an NP–Complete problem. In the proposed algorithm, the connectivity matrix of edges is used for genotype representation. Some novel heuristics are also proposed for mutation, crossover, and creation of random individuals. We evaluate the performance and efficiency of the proposed HLGA-based algorithm in comparison with other existing heuristic and GA-based algorithms by the result of simulation. Simulation results demonstrate that this paper proposed algorithm not only has the fast calculating speed and high accuracy but also can improve the efficiency in Next Generation Networks QoS routing. The proposed algorithm has overcome all of the previous algorithms in the literature..
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
MARGINAL PERCEPTRON FOR NON-LINEAR AND MULTI CLASS CLASSIFICATION ijscai
Generalization error of classifier can be reduced by larger margin of separating hyperplane. The proposed classification algorithm implements margin in classical perceptron algorithm, to reduce generalized errors by maximizing margin of separating hyperplane. Algorithm uses the same updation rule with the perceptron, to converge in a finite number of updates to solutions, possessing any desirable fraction of the margin. This solution is again optimized to get maximum possible margin. The algorithm can process linear, non-linear and multi class problems. Experimental results place the proposed classifier equivalent to the support vector machine and even better in some cases. Some preliminary experimental results are briefly discussed.
Scalability has been an essential factor for any kind of computational algorithm while considering its performance. In this Big Data era, gathering of large amounts of data is becoming easy. Data analysis on Big Data is not feasible using the existing Machine Learning (ML) algorithms and it perceives them to perform poorly. This is due to the fact that the computational logic for these algorithms is previously designed in sequential way. MapReduce becomes the solution for handling billions of data efficiently. In this report we discuss the basic building block for the computations behind ML algorithms, two different attempts to parallelize machine learning algorithms using MapReduce and a brief description on the overhead in parallelization of ML algorithms.
Francesca Gottschalk - How can education support child empowerment.pptxEduSkills OECD
Francesca Gottschalk from the OECD’s Centre for Educational Research and Innovation presents at the Ask an Expert Webinar: How can education support child empowerment?
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdfTechSoup
In this webinar you will learn how your organization can access TechSoup's wide variety of product discount and donation programs. From hardware to software, we'll give you a tour of the tools available to help your nonprofit with productivity, collaboration, financial management, donor tracking, security, and more.
A Strategic Approach: GenAI in EducationPeter Windle
Artificial Intelligence (AI) technologies such as Generative AI, Image Generators and Large Language Models have had a dramatic impact on teaching, learning and assessment over the past 18 months. The most immediate threat AI posed was to Academic Integrity with Higher Education Institutes (HEIs) focusing their efforts on combating the use of GenAI in assessment. Guidelines were developed for staff and students, policies put in place too. Innovative educators have forged paths in the use of Generative AI for teaching, learning and assessments leading to pockets of transformation springing up across HEIs, often with little or no top-down guidance, support or direction.
This Gasta posits a strategic approach to integrating AI into HEIs to prepare staff, students and the curriculum for an evolving world and workplace. We will highlight the advantages of working with these technologies beyond the realm of teaching, learning and assessment by considering prompt engineering skills, industry impact, curriculum changes, and the need for staff upskilling. In contrast, not engaging strategically with Generative AI poses risks, including falling behind peers, missed opportunities and failing to ensure our graduates remain employable. The rapid evolution of AI technologies necessitates a proactive and strategic approach if we are to remain relevant.
Acetabularia Information For Class 9 .docxvaibhavrinwa19
Acetabularia acetabulum is a single-celled green alga that in its vegetative state is morphologically differentiated into a basal rhizoid and an axially elongated stalk, which bears whorls of branching hairs. The single diploid nucleus resides in the rhizoid.
2024.06.01 Introducing a competency framework for languag learning materials ...Sandy Millin
http://sandymillin.wordpress.com/iateflwebinar2024
Published classroom materials form the basis of syllabuses, drive teacher professional development, and have a potentially huge influence on learners, teachers and education systems. All teachers also create their own materials, whether a few sentences on a blackboard, a highly-structured fully-realised online course, or anything in between. Despite this, the knowledge and skills needed to create effective language learning materials are rarely part of teacher training, and are mostly learnt by trial and error.
Knowledge and skills frameworks, generally called competency frameworks, for ELT teachers, trainers and managers have existed for a few years now. However, until I created one for my MA dissertation, there wasn’t one drawing together what we need to know and do to be able to effectively produce language learning materials.
This webinar will introduce you to my framework, highlighting the key competencies I identified from my research. It will also show how anybody involved in language teaching (any language, not just English!), teacher training, managing schools or developing language learning materials can benefit from using the framework.
Synthetic Fiber Construction in lab .pptxPavel ( NSTU)
Synthetic fiber production is a fascinating and complex field that blends chemistry, engineering, and environmental science. By understanding these aspects, students can gain a comprehensive view of synthetic fiber production, its impact on society and the environment, and the potential for future innovations. Synthetic fibers play a crucial role in modern society, impacting various aspects of daily life, industry, and the environment. ynthetic fibers are integral to modern life, offering a range of benefits from cost-effectiveness and versatility to innovative applications and performance characteristics. While they pose environmental challenges, ongoing research and development aim to create more sustainable and eco-friendly alternatives. Understanding the importance of synthetic fibers helps in appreciating their role in the economy, industry, and daily life, while also emphasizing the need for sustainable practices and innovation.
Embracing GenAI - A Strategic ImperativePeter Windle
Artificial Intelligence (AI) technologies such as Generative AI, Image Generators and Large Language Models have had a dramatic impact on teaching, learning and assessment over the past 18 months. The most immediate threat AI posed was to Academic Integrity with Higher Education Institutes (HEIs) focusing their efforts on combating the use of GenAI in assessment. Guidelines were developed for staff and students, policies put in place too. Innovative educators have forged paths in the use of Generative AI for teaching, learning and assessments leading to pockets of transformation springing up across HEIs, often with little or no top-down guidance, support or direction.
This Gasta posits a strategic approach to integrating AI into HEIs to prepare staff, students and the curriculum for an evolving world and workplace. We will highlight the advantages of working with these technologies beyond the realm of teaching, learning and assessment by considering prompt engineering skills, industry impact, curriculum changes, and the need for staff upskilling. In contrast, not engaging strategically with Generative AI poses risks, including falling behind peers, missed opportunities and failing to ensure our graduates remain employable. The rapid evolution of AI technologies necessitates a proactive and strategic approach if we are to remain relevant.
Introduction to AI for Nonprofits with Tapp NetworkTechSoup
Dive into the world of AI! Experts Jon Hill and Tareq Monaur will guide you through AI's role in enhancing nonprofit websites and basic marketing strategies, making it easy to understand and apply.
Model Attribute Check Company Auto PropertyCeline George
In Odoo, the multi-company feature allows you to manage multiple companies within a single Odoo database instance. Each company can have its own configurations while still sharing common resources such as products, customers, and suppliers.
How to Make a Field invisible in Odoo 17Celine George
It is possible to hide or invisible some fields in odoo. Commonly using “invisible” attribute in the field definition to invisible the fields. This slide will show how to make a field invisible in odoo 17.
Macroeconomics- Movie Location
This will be used as part of your Personal Professional Portfolio once graded.
Objective:
Prepare a presentation or a paper using research, basic comparative analysis, data organization and application of economic information. You will make an informed assessment of an economic climate outside of the United States to accomplish an entertainment industry objective.
International Journal on Soft Computing, Artificial Intelligence and Applications (IJSCAI), Vol.8, No.1, February 2019
DOI: 10.5121/ijscai.2019.8103
GRADIENT OMISSIVE DESCENT
IS A MINIMIZATION ALGORITHM
Gustavo A. Lado and Enrique C. Segura
Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Argentina
ABSTRACT
This article presents a promising new gradient-based backpropagation algorithm for multi-layer feedforward networks. The method requires no manual selection of global hyperparameters and is capable of dynamic local adaptations using only first-order information, at low computational cost. Its semi-stochastic nature makes it well suited to mini-batch training and robust to different architecture choices and data distributions. Experimental evidence shows that the proposed algorithm improves training in terms of both convergence rate and speed compared with other well-known techniques.
KEYWORDS
Neural Networks, Backpropagation, Cost Function Minimization
1. INTRODUCTION
1.1. MOTIVATION
When minimizing a highly non-convex cost function by means of some gradient descent strategy, several problems may arise. If we consider a single direction in the search space, there may be local minima [5], where it is first necessary to climb in order to have any chance of reaching a better or global minimum, or plateaus [16], where partial derivatives have very small values even though the search is far from a minimum. Of course, in a multidimensional space these problems can combine to produce others, such as partial minima (ravines) [1], where for some direction (subspace) there is a local minimum while for another there is a plateau, and saddle points [3], where partial derivatives have small values because a partial minimum appears in one direction and a maximum in another. Even considering a single search direction, it is easy to see that an algorithm using only gradient information may have trouble finding a good minimum.
1.2. BACKGROUND
Approaches such as Stochastic Gradient Descent (SGD) [15] and Simulated Annealing (SA) [14, 7] introduce a random factor into the search, which permits, to some extent, overcoming the problem of local minima, while other methods such as Gradient Descent with Momentum (GDM) [11] or adaptive parameters [2] allow the search to be sped up on plateaus. When combined with incremental or mini-batch training, these techniques can be effective to some degree even with ravines and saddle points, but they use global hyperparameters that are generally problem-dependent.
Other techniques, such as Conjugate Gradient Descent (CGD) [12], or those that incorporate second-derivative information, such as the Newton method and the Levenberg-Marquardt algorithm [17], can speed up the search through a better analysis of the cost function, but are in general computationally expensive and highly complex, making them more difficult to implement and use.
Finally, more recent algorithms such as Quick Back-Propagation (QBP) [17], Resilient Back-Propagation (RBP) [13] and Adaptive Momentum Estimation (AME) [19, 4] use adaptive parameters that are independent for each search direction; hence they do not require selection of global hyperparameters and can adapt their strategy to cope with different kinds of problems. Even so, they can get stuck in bad solutions (poor minima). In addition, all these techniques require batch training, which in many cases is not only inconvenient but impossible or highly impractical.
The main difficulty in the search for a good minimum (global or not) is that these algorithms try to make good decisions solely on the basis of local information and values that, in many cases, are the result of estimations of estimations. In other words, it is difficult to make intelligent decisions with incomplete and inaccurate information.
Perhaps a better approach is simply to ensure that, by the end of each epoch, the probability of a cost decrease is greater than that of an increase and that, should the cost increase, there is a reasonably high probability of escaping to another basin of the cost surface. Thus, instead of trying to guarantee convergence through analysis of the cost function, the aim is to warrant a statistical tendency toward a good minimum.
2. THE PROPOSED ALGORITHM
Our method borrows certain ideas from other techniques: from SGD, the introduction of a random element into the descent; from CGD, the exploration of the cost function (though nonlinear in this case); and from RBP, the use of the sign of the gradient as the indicator of the search direction, omitting its magnitude [12, 13, 14, 15, 17]. The main idea is that during training the search moves across the cost surface with random steps within a radius of search, so as to converge statistically to the minimum; this radius decreases as the solution is approached.
2.1. DESCRIPTION
In each epoch the dataset is divided into mini-batches. A partial estimation of the cost function E is made for each mini-batch, and the gradient is computed by standard error backpropagation. The weights are then modified by a random magnitude, bounded by a value R, but using the direction (sign) of the △W previously computed. At the end of each epoch, the cost is compared to that of earlier steps and, if warranted, R is adjusted accordingly.
If we look at each mini-batch as a sample of the dataset, we can consider each epoch as a Monte Carlo estimation of the cost function. Similarly, the fact that the search radius gradually decreases suggests a resemblance to simulated annealing [10]. However, imposing a fixed reduction of R as a function of the number of epochs or of the cost is impractical because of the variability between different problems. For this reason a strategy was chosen where adjustments to R are applied dynamically, depending on the relative cost [9]. This makes the method more robust, but the adjustments must be small, on the order of 1%, since by the very nature of the estimation of E, oscillations might occur.
The basic idea for modifying R can then be expressed as follows: if the cost increased with regard to the last epoch, reduce the radius of search. This is relatively effective in a variety of scenarios, but the results can still be improved by using local parameters instead of global ones [18]. For example, at the end of each epoch an estimation can be made of the individual contribution of each unit to the total cost, that is, the cost per unit (Eu). Similarly, instead of a global radius of search R, we can keep an individual radius (Ru) for each unit.
Under these conditions the rule for adjusting the search radii becomes: if the total cost increased, reduce the search radius for units with higher error and increase it for the others (see Algorithm 1). In this strategy the radius of search can be seen as the level of resolution at which the cost function is examined. If the cost cannot decrease any further, the radius is reduced in order to look at the cost surface in more detail [2]. Increasing the radius for those units whose cost grew helps to maintain an equilibrium and possibly to escape from a bad solution.
Algorithm 1. The Gradient Omissive Descent algorithm.

initialize W, R, E, t
while (E > min_error) ∧ (t < max_epoch):
    E[t] ← 0
    for every batch b:
        W ← W − U(0, R)^|W| × sgn(∂E(W, b) / ∂W)
        E[t] ← E[t] + E(W, b)
    dE ← E[t] − E[t − 1]
    if dE > 0:
        for every unit u:
            if dEu > 0:
                Ru ← Ru × 0.99
            else:
                Ru ← Ru × 1.01
    t ← t + 1
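To make the step rule of Algorithm 1 concrete, here is a minimal NumPy sketch applied to a toy two-input XOR network. It is a deliberately simplified illustration, not the authors' implementation: it uses full-batch gradients rather than mini-batches and a single radius per layer rather than per-unit radii, and all names (init, cost, R1, R2) are ours.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 2-input XOR with values in {-1, 1}, in the spirit of the
# paper's parity experiments.
X = np.array([[-1., -1.], [-1., 1.], [1., -1.], [1., 1.]])
T = np.array([[-1.], [1.], [1.], [-1.]])

def init(n_in, n_out):
    # Weights drawn uniformly from [-N^{-1/2}, N^{-1/2}]
    s = n_in ** -0.5
    return rng.uniform(-s, s, (n_in, n_out))

W1, W2 = init(2, 8), init(8, 1)
R1, R2 = 1.0 / (2 * 8), 1.0 / (8 * 1)   # per-layer radii (simplified)

def cost(W1, W2):
    H = np.tanh(X @ W1)
    Y = np.tanh(H @ W2)
    return H, Y, float(((Y - T) ** 2).mean())

prev = np.inf
for t in range(2000):
    H, Y, _ = cost(W1, W2)
    # Backprop gradients of the squared error; only their signs are used.
    dY = (Y - T) * (1 - Y ** 2)
    g2 = H.T @ dY
    dH = (dY @ W2.T) * (1 - H ** 2)
    g1 = X.T @ dH
    # GOD step: random magnitude bounded by R, direction opposite to
    # the sign of the gradient (the magnitude itself is omitted).
    W1 -= rng.uniform(0, R1, W1.shape) * np.sign(g1)
    W2 -= rng.uniform(0, R2, W2.shape) * np.sign(g2)
    _, _, E = cost(W1, W2)
    if E > prev:                 # cost went up: shrink the search radii
        R1, R2 = R1 * 0.99, R2 * 0.99
    prev = E

print(round(prev, 4))
```

Note how the gradient magnitude never enters the update; only its sign does, with the step size drawn at random up to the current radius.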
3. EXPERIMENTS
3.1. METHODS
In all experiments our technique, Gradient Omissive Descent (GOD), was compared with three other algorithms: SGD, GDM and RBP. These were chosen to cover the alternatives of incremental, mini-batch and batch training, respectively. In future work we will also include other approaches such as CGD, though its results might not be easy to compare, and AME, for which we do not yet have a properly tested implementation.
The stop condition in all simulations was either the completion of a maximum number of epochs
or having reached an error under a minimum value. This error was computed as the average of all
the errors in the training dataset. Since data were artificially generated, it was possible to keep the
same dataset size through all the tests.
As a measure of the efficacy of the algorithms, and since there is no unique stop condition, the quantity (epoch ⁄ max_epoch) × (error ⁄ min_error) was used, where epoch and error are the final values, and max_epoch and min_error are the values of the stop condition. Thus, lower values indicate higher efficacy and, in particular, values less than 1 imply that a solution has been obtained.
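The measure just defined can be computed directly; a minimal helper (the function name and the example values are ours):

```python
def efficacy(epoch, max_epoch, error, min_error):
    """Lower is better; values below 1 mean a solution was found."""
    return (epoch / max_epoch) * (error / min_error)

# A run that reached the target error after 600 of 3000 epochs:
print(efficacy(epoch=600, max_epoch=3000, error=1e-3, min_error=1e-3))  # 0.2
```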
Experiments were carried out on two kinds of problems that are typical for evaluating this class of algorithms: parity and auto-encoding. In both cases, additional constraints were introduced in order to have better control and thus adjust the level of difficulty [8]. Final results were computed as the average of several simulations under the same constraints but with different datasets.
3.1.1. PARITY
The parity problem, i.e. multivariate XOR, is defined as the result of verifying whether the number of 1's in a list of binary variables is even. To generate data for these experiments we use variables xi ∈ {−1, 1} instead of binary ones, defining the function even as:

even(x0, ..., xn) = 1 − 2 [( Σi u(xi) ) mod 2]

where u(x) = 1 if x = 1, and zero otherwise.
In its simplest form, this problem consists of mapping n inputs into a single output that indicates the parity. What makes it particularly interesting is that changing the value of just one input variable changes the output. In our experiments we used 10 input variables [x0..x9] and 10 output variables [z0..z9], where zi = even(x0..xi); that is, each output corresponds to the parity of a portion of the input variables. This makes the problem more difficult, as the single hidden layer must find a representation satisfying all outputs simultaneously.
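The multi-output parity dataset described above can be generated compactly; a sketch (function and variable names are ours, and the bit ordering is an arbitrary choice):

```python
import numpy as np

def u(x):
    # u(x) = 1 if x == 1, else 0 (inputs are in {-1, 1})
    return (x == 1).astype(int)

def parity_dataset(n=10):
    """All 2**n inputs in {-1,1}^n and outputs z_i = even(x_0..x_i)."""
    bits = (np.arange(2 ** n)[:, None] >> np.arange(n)) & 1
    X = np.where(bits == 1, 1, -1)
    ones = np.cumsum(u(X), axis=1)   # running count of 1's up to index i
    Z = 1 - 2 * (ones % 2)           # +1 if that count is even, -1 if odd
    return X, Z

X, Z = parity_dataset(10)
print(X.shape, Z.shape)  # (1024, 10) (1024, 10)
```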
3.1.2. AUTO-ENCODING WITH DEPENDENT VARIABLES
One of the critical factors in the efficacy of auto-encoding is how mutually independent the data are. If the dataset is built from completely independent variables of the same variance, no effective dimensionality reduction is possible [6].
For this reason, the datasets for the experiments were constructed with different proportions of independent and dependent variables. Independent variables were drawn from a uniform random distribution between -1 and 1, and the dependent ones were random linear combinations of the former. The final values result from applying the sign function, in order to work in this case too with values in {−1, 1}.
However, in real situations data are often not equally correlated with one another. This fact was also taken into account by splitting the dataset into several classes. Each class not only uses different correlations for the dependent variables, but also assigns different positions to the independent ones.
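A single-class version of this construction can be sketched as follows; it is our own illustrative reading of the procedure (names and the choice of a Gaussian mixing matrix are assumptions), omitting the multi-class variable-permutation step:

```python
import numpy as np

rng = np.random.default_rng(0)

def autoencoding_dataset(n_samples=1000, n_vars=100, p_indep=0.5):
    """Mix of independent and linearly dependent variables (Sec. 3.1.2)."""
    n_ind = int(n_vars * p_indep)
    ind = rng.uniform(-1, 1, (n_samples, n_ind))
    # Dependent variables: random linear combinations of the independent ones.
    mix = rng.normal(size=(n_ind, n_vars - n_ind))
    dep = ind @ mix
    data = np.concatenate([ind, dep], axis=1)
    return np.where(data >= 0, 1, -1)    # sign step: final values in {-1, 1}

D = autoencoding_dataset()
print(D.shape)  # (1000, 100)
```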
3.2. RESULTS
This section presents comparisons between the results obtained with the different algorithms under consideration for the problems described above. For each method, the parameters that produced the best results were chosen. In all simulations the stop condition was 3000 epochs or an average error below 10^-3. These settings were intended to ensure that all the algorithms had the possibility to converge.
The weights were independently initialized at random from a uniform distribution over the interval [−N^(−1/2), N^(−1/2)], where N is the number of neurons in the previous layer. Initial values for R were computed as 1/(N × M), where N is the number of units in the previous layer and M is the number of units in the same layer. Each mini-batch, both for GDM and for GOD, consisted of 10% of the elements of the dataset, chosen at random without repetition.
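The two initialization formulas above can be written as a short helper; a sketch (the function name is ours):

```python
import numpy as np

rng = np.random.default_rng(0)

def init_layer(n_prev, n_units):
    """Weights ~ U(-N^{-1/2}, N^{-1/2}); initial radius R = 1/(N*M)."""
    s = n_prev ** -0.5
    W = rng.uniform(-s, s, (n_prev, n_units))
    R = 1.0 / (n_prev * n_units)
    return W, R

# E.g. the hidden layer of the parity network: 10 inputs, 40 units.
W, R = init_layer(10, 40)
print(W.shape, R)  # (10, 40) 0.0025
```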
3.2.1. PARITY
For this group of experiments the network had a single hidden layer of 40 units. The training set included all 2^10 = 1024 possible input combinations. For SGD the learning rate η was 0.01; for GDM, η = 0.001 and the momentum factor β = 0.5. For RBP, η+ = 1.2 and η− = 0.5 in all cases, as proposed in [13].
Figure 1. Training results of several algorithms for multiple parity.
Figure 1 shows the time evolution of the sum of square errors by epoch for different algorithms.
These results correspond to a typical run of training, since the average of several simulations
would not show the differences in the behaviour of the different methods. Although the inner stochasticity of the training process introduces large variations in the descent of each algorithm, the order of convergence is markedly consistent. It can be noted that the cost decreases much more rapidly with the proposed method.
3.2.2. AUTO-ENCODING WITH DEPENDENT VARIABLES
A network with 100 input and 100 output neurons was used, while the size of the hidden layer was varied between 70, 80 and 90 units. The training set had 1000 instances, of which 25% were set apart for validation. Given the number of simulations run under the same conditions, holdout validation was used, since in these circumstances there is no need for the additional partitions of cross-validation. The parameters were: for SGD, η = 0.001; for GDM, η = 0.001 and β = 0.5.
Figure 2 shows how the efficacy of the different algorithms varies with the proportion of
independent variables and the number of units in the hidden layer. The three parts of the figure
correspond to different numbers of hidden units, indicated on the left axis (a smaller value
corresponds to a more difficult problem); percentages on the horizontal axis give the proportion
of dependent variables (again, a smaller value corresponds to a more difficult problem); and
values on the vertical axes indicate the efficacy of the algorithms, computed as previously
defined and averaged over several runs. These simulations used 5 classes; results for other
numbers of classes are not displayed, since they show a less pronounced variation that is
nevertheless consistent with the problem complexity.
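The paper does not reproduce the exact construction of the correlated data here. One plausible sketch, assuming each dependent variable is generated as a noisy copy of a randomly chosen independent one (the helper `make_dataset`, the parameter `p_dep` and the noise scale are all illustrative assumptions, not taken from the paper):

```python
import numpy as np

def make_dataset(n_instances, n_vars, p_dep, n_classes=5, rng=None):
    """Illustrative generator: a fraction p_dep of the variables is made
    linearly dependent on randomly chosen independent variables."""
    rng = rng or np.random.default_rng(0)
    n_dep = int(p_dep * n_vars)
    n_indep = n_vars - n_dep
    X_indep = rng.random((n_instances, n_indep))
    # Each dependent column copies a random independent column plus small noise.
    src = rng.integers(0, n_indep, size=n_dep)
    X_dep = X_indep[:, src] + 0.05 * rng.standard_normal((n_instances, n_dep))
    X = np.concatenate([X_indep, X_dep], axis=1)
    y = rng.integers(0, n_classes, size=n_instances)  # class labels
    return X, y
```

Lowering `p_dep` leaves fewer redundant variables for the auto-encoder to exploit, which is why a smaller proportion of dependent variables corresponds to a harder problem.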
Figure 2. Comparison of the performance of different algorithms
in problems of auto-encoding with correlated data.
International Journal on Soft Computing, Artificial Intelligence and Applications (IJSCAI), Vol.8, No.1, February 2019
It is worth noting not only how the efficacy of the proposed technique varies with problems of
different complexity, but also how the relative behaviour of all the compared methods changes.
Depending on the complexity of the problem, the quality of our solution shifts relative to the
other algorithms, but a clear trend toward faster convergence can also be seen.
4. CONCLUSIONS
In the present work an optimization technique was introduced with the ability to find minima or
quasi-minima of cost functions for non-convex, high-dimensional problems. The proposed
algorithm successfully handles several classical difficulties reported in the literature, such as
local minima, plateaus, ravines and saddle points. In our view, the method advances beyond the
previous ones from which it takes inspiration, by exploring the cost function through a
semi-random descent that omits the magnitude of the gradient.
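The two ingredients named above, a sign-only step that omits the gradient magnitude and a semi-random exploration component, can be illustrated with a minimal sketch. This is not the paper's actual update rule; the function name, step size and noise scale are assumptions made for illustration:

```python
import numpy as np

def semi_random_sign_step(w, grad, step=0.01, noise=0.001, rng=None):
    """Move each weight a fixed step against the sign of its gradient
    (magnitude ignored), plus a small random perturbation that lets the
    search explore plateaus and escape saddle points."""
    rng = rng or np.random.default_rng()
    return w - step * np.sign(grad) + noise * rng.standard_normal(w.shape)
```

Because only the sign of the gradient enters the update, the step size is decoupled from the local steepness of the cost surface, which is what mitigates the ravine and plateau pathologies mentioned above.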
The proposed algorithm works efficiently on a wide range of problems, with a convergence rate
and speed higher than those of other common methods. We also note that in certain cases, for
example when the problem is very simple, less sophisticated algorithms such as SGD can be more
effective, but these in turn have the disadvantage of requiring parameter tuning. Similarly, in
problems which can be intractable (due, for example, to architecture limitations) our solution is
comparable to RBP but, unlike it, can work in mini-batches.
ACKNOWLEDGEMENTS
The idea for this algorithm was fundamentally conceived while L. G. was working with Professor
Verónica Dahl in Vancouver, BC. Special thanks to her for the invitation, and to the Emerging
Leaders in the Americas Program (ELAP) of the Government of Canada for making that visit
possible.
REFERENCES
[1] Anna Choromanska, Mikael Henaff, Michael Mathieu, Gérard Ben Arous, Yann LeCun. The loss
surfaces of multilayer networks. Artificial Intelligence and Statistics:192-204, 2015.
[2] Christian Darken, Joseph Chang, John Moody. Learning rate schedules for faster stochastic
gradient search. Proceedings of the 1992 IEEE-SP Workshop on Neural Networks for Signal
Processing II:3-12, 1992.
[3] Yann N. Dauphin, Razvan Pascanu, Caglar Gulcehre, Kyunghyun Cho, Surya Ganguli, Yoshua
Bengio. Identifying and attacking the saddle point problem in high-dimensional non-convex
optimization. Advances in neural information processing systems:2933-2941, 2014.
[4] Timothy Dozat. Incorporating Nesterov momentum into Adam. 2016.
[5] Marco Gori, Alberto Tesi. On the problem of local minima in backpropagation. IEEE
Transactions on Pattern Analysis and Machine Intelligence, 14(1):76-86, 1992.
[6] Geoffrey E. Hinton, Ruslan R. Salakhutdinov. Reducing the dimensionality of data with neural
networks. Science, 313(5786):504-507, 2006.
[7] Scott Kirkpatrick, C. Daniel Gelatt, Mario P. Vecchi, et al. Optimization by simulated annealing.
Science, 220(4598):671-680, 1983.
[8] Steve Lawrence, C. Lee Giles, Ah Chung Tsoi. What size neural network gives optimal
generalization? Convergence properties of backpropagation. 1998.
[9] George D. Magoulas, Michael N. Vrahatis, George S. Androulakis. Improving the convergence of
the backpropagation algorithm using learning rate adaptation methods. Neural Computation,
11(7):1769-1796, 1999.
[10] Nicholas Metropolis, Arianna W. Rosenbluth, Marshall N. Rosenbluth, Augusta H. Teller, Edward
Teller. Equation of state calculations by fast computing machines. The journal of chemical
physics, 21(6):1087-1092, 1953.
[11] Miguel Moreira, Emile Fiesler. Neural networks with adaptive learning rate and momentum terms.
Idiap, 1995.
[12] Martin Fodslette Møller. A scaled conjugate gradient algorithm for fast supervised learning.
Neural networks, 6(4):525-533, 1993.
[13] Martin Riedmiller, Heinrich Braun. A direct adaptive method for faster backpropagation learning:
The RPROP algorithm. Proceedings of the IEEE International Conference on Neural
Networks:586-591, 1993.
[14] Herbert Robbins, Sutton Monro. A stochastic approximation method. The annals of mathematical
statistics:400-407, 1951.
[15] David E. Rumelhart, Geoffrey E. Hinton, Ronald J. Williams. Learning representations by back-
propagating errors. Nature, 323(6088):533, 1986.
[16] Richard S. Sutton. Two problems with backpropagation and other steepest-descent learning
procedures for networks. Proc. 8th annual conf. cognitive science society:823-831, 1986.
[17] Bogdan M. Wilamowski. Neural network architectures and learning. Proceedings of the IEEE
International Conference on Industrial Technology, Vol. 1, 2003.
[18] Z. Zainuddin, N. Mahat, Y. Abu Hassan. Improving the Convergence of the Backpropagation
Algorithm Using Local Adaptive Techniques. International Conference on Computational
Intelligence:173-176, 2004.
[19] Matthew D. Zeiler. ADADELTA: an adaptive learning rate method. arXiv preprint
arXiv:1212.5701, 2012.
AUTHORS
Gustavo Lado graduated in Computer Science from the University of Buenos Aires.
His previous research interests include the simulation of the human visual system
with artificial neural networks; part of this work was recently published in the
book "La Percepción del Hombre y sus Máquinas". He is currently working on neural
networks applied to natural language processing and cognitive semantics as part of
his Ph.D.
Enrique Carlos Segura was born in Buenos Aires, Argentina. He received the M.Sc.
degrees in Mathematics and Computer Science in 1988 and the Ph.D. degree in
Mathematics in 1999, all from the University of Buenos Aires.
He was a Fellow at CONICET (National Research Council, Argentina) and at CNEA
(National Atomic Energy Commission, Argentina). He also held a Thalmann Fellowship
to work at the Universitat Politècnica de Catalunya (Spain) and a Research Fellowship at
South Bank University (London). From 2003 to 2005 he was Director of the
Department of Computer Science of the University of Buenos Aires, where he is at present a Professor and
Researcher. His main areas of research are Artificial Neural Networks (theory and applications) and, in
general, Cognitive Models of Learning and Memory.