
TRAINING THE NEURAL NETWORK USING LEVENBERG-MARQUARDT’S ALGORITHM TO OPTIMIZE THE EVACUATION TIME IN AN AUTOMOTIVE VACUUM PUMP

Vijayashree1*, Kolla Bhanu Prakash2 and T.V. Ananthan3
1, 2, 3 Department of Computer Science and Engineering, Dr. MGR Educational and Research Institute University, Maduravoyal, Chennai 600 095, India

International Journal of Advanced Research in Engineering and Technology (IJARET), ISSN 0976-6480 (Print), ISSN 0976-6499 (Online), Volume 4, Issue 3, April 2013, pp. 93-100, © IAEME

ABSTRACT
Neural networks have been used for engine computations in the recent past. One reason for using neural networks is to capture the accuracy of experimental data while saving computational time, so that system simulations can be performed within a reasonable time frame. The main aim of this study is to optimize and arrive at a design base for a vacuum pump in an automotive engine using Levenberg-Marquardt’s (LM) algorithm for an artificial neural network (ANN). Design bases are created from previous products and by benchmarking. Effortless brake application is a preferred comfort feature in automotive applications; to provide an easy and effective pedal feel, the braking mechanism needs to be assisted with external energy. The design is optimized with the LM algorithm, using the neural network to arrive at the optimum evacuation time.

Index Terms: automotive engine, braking system, evacuation time, Levenberg-Marquardt’s (LM) algorithm, neural networks, vacuum pump.

I. INTRODUCTION
Effortless brake application is a preferred comfort feature in automotive applications. To provide an easy and effective pedal feel, the braking mechanism needs to be assisted with external energy. A vane-type vacuum pump serves exactly this purpose: it produces vacuum by evacuating the air in the vacuum booster. This vacuum is used to actuate the booster for the power brakes in diesel-powered and Gasoline Direct Injection (GDI) automobiles.
The capacity of the vacuum pump varies with the weight of the vehicle and the capacity of its brake booster. It is therefore necessary to have a design base built on a proven technique, which will serve as a basis for faster product development. Neural networks and other machine learning algorithms are increasingly being used for engine applications [1]. These applications can be categorized as either real-time control/diagnostic methods
or predictive tools for design purposes. Some applications have even moved downstream of the engine [2]. The present work uses a neural-network technique with the LM algorithm to arrive at the appropriate evacuation time, which is a critical parameter. The particular task selected is to minimize the evacuation time of a vane-type vacuum pump. The dataset used comprises the results of experiments conducted at UCAL Fuel Systems Ltd., Chennai.

II. VACUUM PUMP
A vane-type vacuum pump has a unique profile in which an eccentrically mounted rotor rotates the vanes, as shown in Fig. 1. The movement of the vanes creates a pressure difference, which creates vacuum in the brake booster. Air enters the pump through the inlet check-valve assembly. Oil is circulated inside the pump to lubricate the rotating parts and to maintain sealing between the high-pressure and low-pressure regions [3, 4, 5]. The air-oil mixture is then expelled from the pump through the reed valve. The performance of the pump is specified by the time taken to evacuate a tank of specified volume [3]:

    Evacuation time, t = (Vt / Q) ln(p1 / p2)

where Vt is the tank volume, Q is the pump’s volumetric flow rate, p1 is the atmospheric pressure and p2 is the required pressure. As noted above, the pump’s capacity varies with vehicle weight and brake-booster capacity, so a design base built on a proven technique is needed for faster product development. The results obtained from the existing pump were used to train the ANN with the LM algorithm, creating a design base for any future design.
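The pump-down relation above can be sketched in a few lines of Python; the numbers below (tank volume, flow rate, pressures) are hypothetical, chosen only to illustrate the formula:

```python
import math

def evacuation_time(tank_volume_l, pump_flow_lps, p1_kpa, p2_kpa):
    """Pump-down time t = (Vt / Q) * ln(p1 / p2).

    tank_volume_l: tank volume Vt (litres)
    pump_flow_lps: effective pump flow rate Q (litres/second)
    p1_kpa:        starting (atmospheric) absolute pressure
    p2_kpa:        required absolute pressure (p2 < p1)
    """
    return (tank_volume_l / pump_flow_lps) * math.log(p1_kpa / p2_kpa)

# Hypothetical example: 10 L booster, 2 L/s pump, 101.3 kPa down to 30 kPa
t = evacuation_time(10.0, 2.0, 101.3, 30.0)
print(round(t, 2), "s")  # prints: 6.08 s
```

Note how the time scales linearly with tank volume and only logarithmically with the pressure ratio, which is why deep vacuum levels dominate the evacuation time.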
Figure 1 shows a vacuum pump of capacity 110 cc.

Fig. 1 Photograph of a vacuum pump of capacity 110 cc

III. LEVENBERG-MARQUARDT’S ALGORITHM
The LM algorithm is an iterative technique that locates a local minimum of a multivariate function expressed as a sum of squares of non-linear, real-valued functions. It has become a standard technique for nonlinear least-squares problems, widely adopted across disciplines for data-fitting applications. LM can be thought of as a combination of steepest descent and the Gauss-Newton method. When the current solution is far from a local minimum, the algorithm behaves like a steepest-descent method: slow, but guaranteed to converge. When the current solution is close to a local minimum, it becomes the Gauss-Newton method and exhibits fast convergence.

Input: a vector function f : R^m → R^n with n ≥ m, a measurement vector x ∈ R^n and an initial parameter estimate p0 ∈ R^m.
Output: a vector p+ ∈ R^m minimizing ||x − f(p)||^2.

Algorithm:
    k := 0; ν := 2; p := p0;
    A := J^T J; εp := x − f(p); g := J^T εp;
    stop := (||g||∞ ≤ ε1); µ := τ · max_i(A_ii);
    while (not stop) and (k < kmax)
        k := k + 1;
        repeat
            Solve (A + µI) δp = g;
            if (||δp|| ≤ ε2 ||p||)
                stop := true;
            else
                pnew := p + δp;
                ρ := (||εp||^2 − ||x − f(pnew)||^2) / (δp^T (µ δp + g));
                if ρ > 0
                    p := pnew;
                    A := J^T J; εp := x − f(p); g := J^T εp;
                    stop := (||g||∞ ≤ ε1);
                    µ := µ · max(1/3, 1 − (2ρ − 1)^3); ν := 2;
                else
                    µ := µ · ν; ν := 2ν;
                endif
            endif
        until (ρ > 0) or (stop)
    endwhile
    p+ := p;

The above is the Levenberg-Marquardt nonlinear least-squares algorithm. ρ is the gain ratio: the ratio of the actual reduction in the error ||εp||^2 produced by a step δp to the reduction predicted for δp by the linear model of Eq. (1); see the text and [6, 7] for details. For large problems, the linear solve can be carried out by taking the sparse structure of the matrix A into account.

In the following, vectors and arrays appear in boldface and ^T denotes transposition; ||·|| and ||·||∞ denote the 2-norm and the infinity norm, respectively. Let f be an assumed functional relation mapping a parameter vector p ∈ R^m to an estimated measurement vector x̂ = f(p), x̂ ∈ R^n. Given an initial parameter estimate p0 and a measured vector x, it is desired to find the vector p+ that best satisfies the functional relation f locally, i.e. minimizes the squared distance ε^T ε with ε = x − x̂, for all p within a sphere of a certain small radius. The basis of the LM algorithm is a linear approximation to f in the neighborhood of p.
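As a concrete illustration, the pseudocode above translates almost line for line into NumPy. The sketch below is an illustrative dense implementation (not the authors’ MATLAB code), applied to a hypothetical two-parameter exponential fit:

```python
import numpy as np

def levenberg_marquardt(f, jac, x, p0, tau=1e-3, eps1=1e-8, eps2=1e-8, kmax=100):
    """Dense LM following the pseudocode above.

    f(p)   -> model prediction vector (same length as the measurements x)
    jac(p) -> Jacobian of f at p, shape (n, m)
    """
    p = np.asarray(p0, dtype=float)
    x = np.asarray(x, dtype=float)
    J = jac(p)
    A = J.T @ J                        # Gauss-Newton approximation to the Hessian
    e = x - f(p)                       # residual vector
    g = J.T @ e                        # negative gradient of 0.5 * e.e
    stop = np.linalg.norm(g, np.inf) <= eps1
    mu = tau * A.diagonal().max()      # initial damping term
    k, nu = 0, 2.0
    while not stop and k < kmax:
        k += 1
        rho = -1.0
        while rho <= 0 and not stop:
            dp = np.linalg.solve(A + mu * np.eye(len(p)), g)
            if np.linalg.norm(dp) <= eps2 * np.linalg.norm(p):
                stop = True            # step too small: converged
            else:
                p_new = p + dp
                e_new = x - f(p_new)
                # gain ratio: actual error reduction vs. linear-model prediction
                rho = (e @ e - e_new @ e_new) / (dp @ (mu * dp + g))
                if rho > 0:            # step accepted: move and relax damping
                    p, e = p_new, e_new
                    J = jac(p)
                    A, g = J.T @ J, J.T @ e
                    stop = np.linalg.norm(g, np.inf) <= eps1
                    mu *= max(1.0 / 3.0, 1.0 - (2.0 * rho - 1.0) ** 3)
                    nu = 2.0
                else:                  # step rejected: increase damping, retry
                    mu *= nu
                    nu *= 2.0
    return p

# Fit y = a * exp(-b * t) to synthetic, noise-free data (hypothetical example).
t = np.linspace(0.0, 4.0, 20)
y = 3.0 * np.exp(-0.7 * t)
model = lambda p: p[0] * np.exp(-p[1] * t)
jac = lambda p: np.column_stack([np.exp(-p[1] * t),
                                 -p[0] * t * np.exp(-p[1] * t)])
p_fit = levenberg_marquardt(model, jac, y, [1.0, 0.1])
print(np.round(p_fit, 4))  # ≈ [3.0, 0.7]
```

The default values τ = 1e-3 and kmax = 100 mirror the indicative values quoted later in the text; the stricter ε thresholds here simply let the toy fit run to high precision.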
Denoting by J the Jacobian matrix ∂f(p)/∂p, a Taylor series expansion for a small ||δp|| leads to the approximation

    f(p + δp) ≈ f(p) + J δp    (1)

Like all non-linear optimization methods, LM is iterative. Initiated at the starting point p0, it produces a series of vectors p1, p2, … that converge towards a local minimizer p+ of f. At each iteration, it is therefore required to find the step δp that minimizes the quantity

    ||x − f(p + δp)|| ≈ ||x − f(p) − J δp|| = ||ε − J δp||    (2)
The sought δp is thus the solution of a linear least-squares problem: the minimum is attained when J δp − ε is orthogonal to the column space of J. This leads to J^T (J δp − ε) = 0, which yields the Gauss-Newton step δp as the solution of the so-called normal equations:

    J^T J δp = J^T ε    (3)

Ignoring second-derivative terms, the matrix J^T J in Eq. (3) approximates the Hessian of ½ ε^T ε. Note also that J^T ε lies along the steepest-descent direction, since the gradient of ½ ε^T ε is −J^T ε. The LM method actually solves a slight variation of Eq. (3), known as the augmented normal equations:

    N δp = J^T ε, with N ≡ J^T J + µI and µ > 0    (4)

where I is the identity matrix. The strategy of altering the diagonal elements of J^T J is called damping, and µ is referred to as the damping term. If the updated parameter vector p + δp, with δp computed from Eq. (4), leads to a reduction in the error ε^T ε, the update is accepted and the process repeats with a decreased damping term. Otherwise, the damping term is increased, the augmented normal equations are solved again, and the process iterates until a value of δp that decreases the error is found. The process of repeatedly solving Eq. (4) for different values of the damping term until an acceptable update to the parameter vector is found corresponds to one iteration of the LM algorithm.

In LM, the damping term is adjusted at each iteration to assure a reduction in the error. If the damping is set to a large value, the matrix N in Eq. (4) is nearly diagonal and the LM update step δp is near the steepest-descent direction J^T ε. Moreover, the magnitude of δp is reduced in this case, ensuring that excessively large Gauss-Newton steps are not taken. Damping also handles situations where the Jacobian is rank-deficient and J^T J is therefore singular [4].
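The limiting behaviour of the augmented normal equations (Eq. 4) can be checked numerically: for very small µ the step solves the plain normal equations (the Gauss-Newton step), while for very large µ it collapses to the scaled steepest-descent direction J^T ε / µ. The Jacobian and residual below are random, purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
J = rng.standard_normal((6, 2))   # Jacobian of a toy 2-parameter model
eps = rng.standard_normal(6)      # residual vector

def lm_step(mu):
    """Solve the augmented normal equations (J^T J + mu*I) dp = J^T eps."""
    return np.linalg.solve(J.T @ J + mu * np.eye(2), J.T @ eps)

gn = np.linalg.solve(J.T @ J, J.T @ eps)   # exact Gauss-Newton step

# Small mu: the LM step matches Gauss-Newton.
print(np.allclose(lm_step(1e-12), gn, atol=1e-6))            # True
# Large mu: the LM step shrinks toward the steepest-descent direction J^T eps / mu.
print(np.allclose(lm_step(1e8), (J.T @ eps) / 1e8, atol=1e-10))  # True
```

This is exactly the behaviour described in the text: large damping gives short, safe steepest-descent-like steps, while small damping recovers fast Gauss-Newton steps.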
The damping term can be chosen so that the matrix N in Eq. (4) is nonsingular and therefore positive definite, ensuring that the δp computed from it lies in a descent direction. In this way, LM can defensively navigate regions of the parameter space in which the model is highly nonlinear. If the damping is small, the LM step approximates the exact Gauss-Newton step. LM is adaptive because it controls its own damping: it raises the damping if a step fails to reduce ε^T ε and reduces it otherwise. By doing so, LM alternates between a slow descent approach when far from the minimum and fast, quadratic convergence in the neighborhood of the minimum [8].

The LM algorithm terminates when at least one of the following conditions is met:
1. The gradient’s magnitude drops below a threshold ε1.
2. The relative change in the magnitude of δp drops below a threshold ε2.
3. A maximum number of iterations kmax is reached.

The complete LM algorithm is shown in the pseudocode above; more details regarding it can be found in [6]. The initial damping factor is chosen equal to the product of a parameter τ and the maximum diagonal element of J^T J. Indicative values for the user-defined parameters are τ = 10^-3, ε1 = ε2 = 10^-2, kmax = 100.

IV. METHODOLOGY OF NEURAL NETWORKS IN VACUUM PUMP PERFORMANCE OPTIMIZATION
The performance of the vacuum pump is determined by the time required to evacuate air from the reservoir. This time depends on various parameters such as temperature, oil pressure and rotation speed. Vacuum pump development requires a procedure for developing a pump of any capacity based on the customer’s requirements.

In the first, training stage, the inputs and the desired outputs are given to the NN. The weights are modified to minimize the error between the NN predictions and the expected outputs.
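As a minimal sketch of this training stage, assuming nothing about the paper’s actual dataset, the loop below trains a tiny one-hidden-layer network by gradient-descent back-propagation on synthetic inputs; each pass through the loop is one epoch:

```python
import numpy as np

# Toy stand-in for the pump data: two normalized inputs (e.g. temperature,
# speed); the target is a hypothetical normalized evacuation time.
rng = np.random.default_rng(1)
X = rng.random((40, 2))
y = (0.8 - 0.5 * X[:, 1] + 0.2 * X[:, 0]).reshape(-1, 1)  # synthetic target

# One hidden layer of sigmoid units, linear output.
W1 = rng.standard_normal((2, 8)) * 0.5; b1 = np.zeros(8)
W2 = rng.standard_normal((8, 1)) * 0.5; b2 = np.zeros(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

lr, losses = 0.5, []
for epoch in range(2000):
    h = sigmoid(X @ W1 + b1)              # forward pass
    out = h @ W2 + b2
    err = out - y
    losses.append(float((err ** 2).mean()))
    # backward pass: gradients of the mean-squared error
    d_out = 2 * err / len(X)
    dW2 = h.T @ d_out; db2 = d_out.sum(0)
    d_h = (d_out @ W2.T) * h * (1 - h)    # sigmoid derivative
    dW1 = X.T @ d_h; db1 = d_h.sum(0)
    # weight update: one recalculation of the weights per epoch
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print(round(losses[0], 4), "->", round(losses[-1], 6))  # error shrinks over epochs
```

After training, prediction for new inputs is just the forward pass with the final, fixed weights, as described in the text.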
Different types of learning algorithms have been developed, but the most common and robust one is back-propagation. The goal of the training is to minimize the error and consequently to optimize the NN solution. Each iterative step in which the weights are recalculated is called an epoch. When the minimum is achieved, the weights are fixed and the training process ends. Once a neural network has been trained to a satisfactory
level, it may be used as a predictive tool for new data. To do this, only the inputs are given to the NN, and the NN’s predicted outputs are calculated using the previously fixed, error-minimizing weights.

V. RESULTS AND DISCUSSION
The dataset used was obtained from UCAL Fuel Systems Ltd., Chennai. There were 4 sets of training data, each set corresponding to a different combination of pump and tank capacity, speed, pressure and evacuation time. There were 21×6 training data points and 4 input features. The target values were the 21×6 evacuation times, normalized by the minimum possible evacuation time. There were 10 such sets for testing. No tuning set needed to be extracted from the training data: because of the large number of training points, the training error as well as the tuning error decreased asymptotically beyond a few hundred epochs, and early stopping did not occur. The MATLAB neural network toolbox was used to build the baseline neural networks. The Levenberg-Marquardt algorithm [9, 10] was used with the back-propagation algorithm. The chosen configuration had twenty-five hidden layers with an optimal 10 sigmoid-activation neurons each, and an output layer of ten neurons with a linear activation function. The Nguyen-Widrow method was used to initialize the weights. Evacuation time predictions were made using this configuration (baseline case).

The reasons to incorporate a physical model into a neural network are:
1. To make the network more robust: even if confronted with a set of conditions very different from those encountered in the training data, the network should output realistic results.
2. To reduce dependence on training data, i.e. to enable the network to form a reasonable hypothesis from small datasets.
3.
To improve the prediction accuracy.

Table 1 Experimental data for tank capacity 100 cc and pump capacity 3 cc (evacuation time)

    Temperature \ Speed    400      1000     1500     2300
    50                     3.47     1.97     1.7      1.61
    90                     3.53     1.98     1.8      1.7
    120                    3.92     2.08     1.8      1.75
    150                    4.77     2.16     1.17     1.72

Table 2 ANN result for tank capacity 100 cc and pump capacity 3 cc (hidden layers: 25)

    Temperature \ Speed    400        1000       1500       2300
    50                     3.47912    1.7302     1.9189     1.60273
    90                     3.53071    1.98974    1.32223    1.67414
    120                    3.90548    2.18308    0.84523    1.73175
    150                    4.90085    1.78111    2.24074    1.67527

The reported error is the mean square error over the normalized evacuation-time values. It is always the test error, unless otherwise mentioned. It was noticed from the error plots that most of the error occurred
over the -0.2396 region (Fig. 2). The other regions had much smaller errors, and this error was therefore chosen for comparison with the three new methods.

Fig. 2 Error histogram

The mean square error between the model output and the target output is a typical measure of neural network performance. However, there are practical difficulties in establishing acceptance criteria for the mean square error, so a normalised version of the mean square error was implemented. This normalised mean square error used the nearer-specification-limit concept, modified to encompass the definition of an acceptable percentage error level. Here, the acceptable error was equated to the typical level of propagated error that one would expect from the instrumentation measuring the engine performance. This is consistent with the idea that it is not reasonable to expect a higher standard of inference from the model than one could expect from direct measurement of the engine performance.

The performance values obtained during training are:
    performance      = 0.1601
    trainPerformance = 8.4504e-008
    valPerformance   = 0.4123
    testPerformance  = 0.2283

During training, the progress is constantly updated in the training window. Of most interest are the performance, the magnitude of the gradient of the performance, and the number of validation checks. The magnitude of the gradient and the number of validation checks are used to terminate the training. The gradient becomes very small as the training reaches a minimum of the performance. If the magnitude of the gradient falls below 1e-5, the training will stop (Fig. 3). This limit can be adjusted by setting the parameter net.trainParam.min_grad. The number of validation checks represents the number of successive iterations in which the validation performance fails to decrease.
If this number reaches 6 (the default value), the training will stop.
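The two stopping rules just described (gradient magnitude below min_grad; validation performance failing to decrease for six successive checks) can be sketched as follows; the constants mirror the MATLAB defaults quoted in the text, and should_stop is a hypothetical helper name:

```python
MIN_GRAD = 1e-5   # mirrors net.trainParam.min_grad
MAX_FAIL = 6      # successive validation-check failures allowed

def should_stop(grad_magnitude, val_history):
    """Stop if the gradient is tiny, or if the validation error has
    failed to decrease for MAX_FAIL successive epochs."""
    if grad_magnitude < MIN_GRAD:
        return True
    fails, best = 0, float("inf")
    for v in val_history:
        if v < best:
            best, fails = v, 0   # improvement: reset the failure counter
        else:
            fails += 1
    return fails >= MAX_FAIL

print(should_stop(2e-6, [0.5, 0.4]))                                      # True: gradient below limit
print(should_stop(0.1, [0.5, 0.4, 0.41, 0.42, 0.43, 0.44, 0.45, 0.46]))   # True: 6 failed checks
print(should_stop(0.1, [0.5, 0.4, 0.3]))                                  # False: still improving
```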
Fig. 3 Gradient plot

The performance plot (Fig. 4) shows the value of the performance function versus the iteration number (epochs). It plots the training, validation and test performances. The best validation performance is 0.17081, at epoch 1.

Fig. 4 Performance plot

The training-state plot shows the progress of other training variables, such as the gradient magnitude and the number of validation checks (Fig. 5). The error histogram plot shows the distribution of the network errors. The regression plot shows a regression between network outputs and network targets.

Fig. 5 Training regression plots

The three axes represent the training, validation and testing data. The dashed line in each axis represents the perfect result, outputs = targets. The solid line represents the best-fit linear regression between outputs and targets. The R value is an indication of the relationship between the outputs and targets. If R = 1, this indicates that there is an exact linear relationship between outputs and targets.
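The R value shown in these plots is the linear correlation coefficient between network outputs and targets; a short sketch with made-up output/target vectors:

```python
import numpy as np

def r_value(outputs, targets):
    """Pearson correlation between network outputs and targets.
    R = 1 means an exact linear relationship outputs = a*targets + b."""
    o, t = np.asarray(outputs, float), np.asarray(targets, float)
    return float(np.corrcoef(o, t)[0, 1])

targets = np.array([1.0, 2.0, 3.0, 4.0])
print(round(r_value(2 * targets + 1, targets), 3))        # exactly linear: 1.0
print(round(r_value([2.1, 1.9, 3.2, 3.9], targets), 3))   # close fit: near 1
```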
If R is close to zero, then there is no linear relationship between outputs and targets. For this example, the training data indicate a good fit. The validation and test results also show R values greater than 0.9. The scatter plot is helpful in showing that certain data points have poor fits. Here, the R values at training, validation, test and for all three combined are 0.083294, 0.13655, 0.80023 and 0.080557, respectively.

VI. CONCLUSION
From the results obtained with the above Levenberg-Marquardt’s algorithm, it can be concluded that the algorithm works quite satisfactorily in optimizing the evacuation time in automotive engines. The optimization has been validated and found to be accurate to the 5% level; the deviations of the NN-optimized values from the experimental results were also within 5%.

VII. ACKNOWLEDGEMENT
I wish to thank Mr. J. Suresh Kumar, Deputy General Manager of UCAL Fuel Systems Ltd., Chennai, for his help in conducting the experiments, generating the dataset for this project and validating it on their prototype.

REFERENCES
[1] Indranil Brahma, Yongsheng He and Christopher J. Rutland, “Improvement of Neural Network Accuracy for Engine Simulations”, SAE Paper 2003-01-3227.
[2] He, Y. and Rutland, C.J., “Application of Artificial Neural Network for Integration of Advanced Engine Simulation Methods”, Proceedings of the 2000 Fall Technical Conference of the ASME Internal Combustion Engine Division, ICE-Vol. 35-1, pp. 53-64, Paper No. 2000-ICE-304, 2000.
[3] Chambers, A., Fitch, R. K. and Halliday, B.
S., “Basic Vacuum Technology”, ISBN 0-75-030495-2, 1998.
[4] Nagendiran, S., Sivanantham, R. and Kumar, J., “Improvement of the Performance of Cam-Operated Vacuum Pump for Multi Jet Diesel Engine”, SAE Technical Paper 2009-01-1462, 2009, doi:10.4271/2009-01-1462.
[5] Nagendiran, S. R., Arun Subramanian, J. Suresh Kumar and Ramalingam Sivanantham, “Designing of Automotive Vacuum Pumps - Development of Mathematical Model for Critical Parameters and Optimization using Artificial Neural Networks”, SAE Paper No. 2012-01-0779.
K. Madsen, H. Nielsen and O. Tingleff, “Methods for Non-Linear Least Squares Problems”, Technical University of Denmark, 2004. Lecture notes, available at http://www.imm.dtu.dk/courses/02611/nllsq.pdf.
[6] Manolis I. A. Lourakis and Antonis A. Argyros, “Is Levenberg-Marquardt the Most Efficient Optimization Algorithm for Implementing Bundle Adjustment?”, Proceedings of the Tenth IEEE International Conference on Computer Vision (ICCV’05), IEEE Computer Society.
[7] J. Dennis and R. Schnabel, “Numerical Methods for Unconstrained Optimization and Nonlinear Equations”, Classics in Applied Mathematics, SIAM Publications, Philadelphia, 1996.
[8] Indranil Brahma, Yongsheng He and Christopher J. Rutland, “Improvement of Neural Network Accuracy for Engine Simulations”, SAE Paper 2003-01-3227.
[9] Hagan, M. T. and Menhaj, M. B., “Training Feedforward Networks with the Marquardt Algorithm”, IEEE Transactions on Neural Networks, Vol. 5, No. 6, pp. 989-993, 1994.
[10] Pallavi H. Agarwal, Prof. Dr. P. M. George and Prof. Dr. L. M. Manocha, “Comparison of Neural Network Models on Material Removal Rate of C-SiC”, International Journal of Design and Manufacturing Technology (IJDMT), Volume 3, Issue 1, 2012, pp.
1-10, ISSN Print: 0976-6995, ISSN Online: 0976-7002.
[11] Dharmendra Kumar Singh, Dr. Moushmi Kar and Dr. A. S. Zadgaonkar, “Analysis of Generated Harmonics Due to Transformer Load on Power System Using Artificial Neural Network”, International Journal of Electrical Engineering & Technology (IJEET), Volume 4, Issue 1, 2013, pp. 81-90, ISSN Print: 0976-6545, ISSN Online: 0976-6553.
