Neural Network

Published in: Education, Technology

  1. 1. Neural Network to solve Traveling Salesman Problem
  2. 2. Roadmap <ul><li>Hopfield Neural Network </li></ul><ul><li>Solving TSP using Hopfield Network </li></ul><ul><li>Modification of Hopfield Neural Network </li></ul><ul><li>Solving TSP using Concurrent Neural Network </li></ul><ul><li>Comparison between Neural Network and SOM for solving TSP </li></ul>
  3. 3. Background <ul><li>Neural Networks </li></ul><ul><ul><li>Computing device composed of processing elements called neurons </li></ul></ul><ul><ul><li>Processing power comes from the interconnections between neurons </li></ul></ul><ul><li>Models include Hopfield, backpropagation, perceptron and Kohonen networks </li></ul>
  4. 4. Associative memory <ul><li>Associative memory </li></ul><ul><ul><li>Produces, for any input pattern, a similar stored pattern </li></ul></ul><ul><ul><li>Retrieval from part of the data </li></ul></ul><ul><ul><li>Noisy input can also be recognized </li></ul></ul>[Figure: original, degraded and reconstructed pattern]
  5. 5. Hopfield Network <ul><li>Recurrent network </li></ul><ul><ul><li>Feedback from output to input </li></ul></ul><ul><li>Fully connected </li></ul><ul><ul><li>Every neuron connected to every other neuron </li></ul></ul>
  6. 6. Hopfield Network <ul><li>Symmetric connections </li></ul><ul><ul><li>Connection weights from unit i to unit j and from unit j to unit i are identical for all i and j </li></ul></ul><ul><ul><li>No self connection, so weight matrix is 0-diagonal and symmetric </li></ul></ul><ul><li>Logic levels are +1 and -1 </li></ul>
  7. 7. Computation <ul><li>For any neuron $i$, the input at instant $t$ is </li></ul><ul><li>$\sum_{j=1,\, j \neq i}^{n} w_{ij}\, \sigma_j(t)$ </li></ul><ul><li>$\sigma_j(t)$ is the activation of the $j$-th neuron </li></ul><ul><li>Threshold $\theta = 0$ </li></ul><ul><li>Activation $\sigma_i(t+1) = \operatorname{sgn}\left(\sum_{j=1,\, j \neq i}^{n} w_{ij}\, \sigma_j(t)\right)$ </li></ul><ul><li>where $\operatorname{sgn}(x) = +1$ for $x > 0$ and $\operatorname{sgn}(x) = -1$ for $x < 0$ </li></ul>
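The update rule on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the presenters' code: the function names, the 3-neuron weight matrix and the choice to map sgn(0) to +1 (the slide leaves that case undefined) are all assumptions.

```python
import numpy as np

def sgn(x):
    # The slide defines sgn only for x != 0; mapping 0 to +1 is an assumption.
    return 1 if x >= 0 else -1

def update_neuron(w, sigma, i):
    """New activation of neuron i: sgn of the weighted sum over all j != i."""
    net_input = sum(w[i, j] * sigma[j] for j in range(len(sigma)) if j != i)
    return sgn(net_input)

# Hypothetical symmetric, zero-diagonal weights for three neurons.
w = np.array([[0.0, 1.0, -1.0],
              [1.0, 0.0, -1.0],
              [-1.0, -1.0, 0.0]])

sigma = np.array([-1, 1, -1])
new_sigma_0 = update_neuron(w, sigma, 0)  # net input is 1*1 + (-1)*(-1) = 2
```

Here neuron 0 receives a positive net input and therefore switches to +1.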
  8. 8. Modes of operation <ul><li>Synchronous </li></ul><ul><ul><li>All neurons are updated simultaneously </li></ul></ul><ul><li>Asynchronous </li></ul><ul><ul><li>Simple : Only one unit is randomly selected at each step </li></ul></ul><ul><ul><li>General : Neurons update themselves independently and randomly based on probability distribution over time. </li></ul></ul>
  9. 9. Stability <ul><li>The issue of stability arises because a Hopfield network has feedback </li></ul><ul><li>The dynamics may lead to a fixed point, a limit cycle or chaos </li></ul><ul><ul><li>Fixed point : the state settles at a point attractor </li></ul></ul><ul><ul><li>Limit cycle : the state repeats itself in periodic cycles </li></ul></ul><ul><ul><li>Chaotic : aperiodic strange attractor </li></ul></ul>
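A small example makes the difference between a limit cycle and a fixed point concrete. The sketch below uses a hypothetical two-neuron network with a single inhibitory connection. Updating both neurons synchronously makes the state oscillate with period 2, while updating them one at a time settles into a fixed point. For reproducibility the asynchronous sweep visits neurons in a fixed order rather than at random, and sgn(0) is taken as +1; both are assumptions.

```python
import numpy as np

def synchronous_step(w, sigma):
    """Update every neuron at once from the same old state."""
    return np.where(w @ sigma >= 0, 1, -1)

def asynchronous_sweep(w, sigma):
    """Update neurons one at a time, each seeing the latest state."""
    sigma = sigma.copy()
    for i in range(len(sigma)):
        sigma[i] = 1 if w[i] @ sigma >= 0 else -1
    return sigma

# Two neurons that inhibit each other (symmetric, zero diagonal).
w = np.array([[0, -1],
              [-1, 0]])
start = np.array([1, 1])

s1 = synchronous_step(w, start)   # both neurons flip together
s2 = synchronous_step(w, s1)      # and flip back: a period-2 limit cycle

a1 = asynchronous_sweep(w, start)
a2 = asynchronous_sweep(w, a1)    # unchanged: a fixed point
```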
  10. 10. Procedure <ul><li>Store and stabilize the vector that is to be part of the memory. </li></ul><ul><li>Find the weights $w_{ij}$, for all $i, j$, such that: </li></ul><ul><ul><li>$\langle \sigma_1, \sigma_2, \sigma_3, \ldots, \sigma_N \rangle$ is stable in a Hopfield network of $N$ neurons. </li></ul></ul>
  11. 11. Weight learning <ul><li>Weight learning is given by </li></ul><ul><ul><li>$w_{ij} = \frac{1}{N-1}\, \sigma_i \sigma_j$ </li></ul></ul><ul><ul><li>$\frac{1}{N-1}$ is the normalizing factor </li></ul></ul><ul><li>$\sigma_i \sigma_j$ derives from Hebb’s rule </li></ul><ul><ul><li>If two connected neurons are both ON, the weight of the connection is such that the mutual excitation is sustained. </li></ul></ul><ul><ul><li>Similarly, if two neurons inhibit each other, the connection should sustain the mutual inhibition. </li></ul></ul>
  12. 12. Multiple Vectors <ul><li>If multiple vectors need to be stored in memory, such as </li></ul><ul><ul><li>$\langle \sigma_1^1, \sigma_2^1, \sigma_3^1, \ldots, \sigma_N^1 \rangle$ </li></ul></ul><ul><ul><li>$\langle \sigma_1^2, \sigma_2^2, \sigma_3^2, \ldots, \sigma_N^2 \rangle$ </li></ul></ul><ul><ul><li>$\vdots$ </li></ul></ul><ul><ul><li>$\langle \sigma_1^p, \sigma_2^p, \sigma_3^p, \ldots, \sigma_N^p \rangle$ </li></ul></ul><ul><ul><li>then the weights are given by: </li></ul></ul><ul><ul><li>$w_{ij} = \frac{1}{N-1} \sum_{m=1}^{p} \sigma_i^m \sigma_j^m$ </li></ul></ul>
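The Hebbian storage rule for several vectors can be tried out directly. The sketch below is illustrative only: the two 8-bit patterns, the function names and the fixed-order asynchronous recall loop are assumptions, chosen so that the patterns are orthogonal and recall is easy to follow.

```python
import numpy as np

def hebbian_weights(patterns):
    """w_ij = 1/(N-1) * sum_m sigma_i^m sigma_j^m, with a zero diagonal."""
    patterns = np.asarray(patterns, dtype=float)
    n = patterns.shape[1]
    w = patterns.T @ patterns / (n - 1)
    np.fill_diagonal(w, 0.0)  # no self-connections
    return w

def recall(w, sigma, sweeps=5):
    """Asynchronous recall, visiting neurons in a fixed order."""
    sigma = np.array(sigma, dtype=int)
    for _ in range(sweeps):
        for i in range(len(sigma)):
            sigma[i] = 1 if w[i] @ sigma >= 0 else -1
    return sigma

stored = [[1, 1, 1, 1, -1, -1, -1, -1],
          [1, -1, 1, -1, 1, -1, 1, -1]]
w = hebbian_weights(stored)

noisy = [-1, 1, 1, 1, -1, -1, -1, -1]  # first pattern with one bit flipped
restored = recall(w, noisy)            # recovers the first stored pattern
```

This is the associative-memory behaviour described earlier in the slides: a degraded input is pulled back to the nearest stored vector.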
  13. 13. Energy <ul><li>An energy is associated with each state of the system. </li></ul><ul><li>The patterns that need to be made stable correspond to minimum-energy states of the system. </li></ul>
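  14. 14. Energy function <ul><li>Energy at state $\sigma' = \langle \sigma_1, \sigma_2, \sigma_3, \ldots, \sigma_N \rangle$ </li></ul><ul><ul><li>$E(\sigma') = -\frac{1}{2} \sum_i \sum_{j \neq i} w_{ij}\, \sigma_i \sigma_j$ </li></ul></ul><ul><li>Let the $p$-th neuron change its state from $\sigma_p^{\text{initial}}$ to $\sigma_p^{\text{final}}$. Because the weights are symmetric, each pair $(p, j)$ occurs twice in the double sum, which cancels the factor ½, so </li></ul><ul><ul><li>$E_{\text{initial}} = -\sum_{j \neq p} w_{pj}\, \sigma_p^{\text{initial}} \sigma_j + T$ </li></ul></ul><ul><ul><li>$E_{\text{final}} = -\sum_{j \neq p} w_{pj}\, \sigma_p^{\text{final}} \sigma_j + T$ </li></ul></ul><ul><ul><li>$\Delta E = E_{\text{final}} - E_{\text{initial}}$ </li></ul></ul><ul><ul><ul><li>$T$ is independent of $\sigma_p$ </li></ul></ul></ul>
  15. 15. Continued… <ul><ul><li>$\Delta E = -\left(\sigma_p^{\text{final}} - \sigma_p^{\text{initial}}\right) \sum_{j \neq p} w_{pj}\, \sigma_j$ </li></ul></ul><ul><ul><li>i.e. $\Delta E = -\Delta\sigma_p \sum_{j \neq p} w_{pj}\, \sigma_j$ </li></ul></ul><ul><ul><li>Thus: $\Delta E = -\Delta\sigma_p \times \text{netinput}_p$ </li></ul></ul><ul><li>If $p$ changes from +1 to -1 then $\Delta\sigma_p$ is negative and $\text{netinput}_p$ is negative, and vice versa. </li></ul><ul><li>So $\Delta E$ is always negative. Thus the energy always decreases when a neuron changes state. </li></ul>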
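The claim that the energy drops whenever a neuron changes state can be checked numerically. The sketch below is an illustration under assumptions: it builds a random symmetric, zero-diagonal weight matrix (the size, seed and number of steps are arbitrary), runs asynchronous updates, and records the full energy $E = -\frac{1}{2}\sum_i \sum_{j \neq i} w_{ij}\sigma_i\sigma_j$ after every flip.

```python
import numpy as np

def energy(w, sigma):
    """E = -1/2 * sum_i sum_{j != i} w_ij sigma_i sigma_j (w has a zero diagonal)."""
    return -0.5 * sigma @ w @ sigma

rng = np.random.default_rng(0)
n = 12
a = rng.normal(size=(n, n))
w = (a + a.T) / 2          # symmetric weights
np.fill_diagonal(w, 0.0)   # no self-connections
sigma = rng.choice([-1, 1], size=n).astype(float)

energies = [energy(w, sigma)]
for _ in range(200):
    i = rng.integers(n)                          # pick one neuron at random
    new_state = 1.0 if w[i] @ sigma > 0 else -1.0
    if new_state != sigma[i]:
        sigma[i] = new_state                     # the neuron changes state
        energies.append(energy(w, sigma))

drops = np.diff(energies)  # energy change caused by each flip
```

Every entry of `drops` comes out negative, so the network can only move downhill in energy and must eventually stop at a local minimum.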
  16. 16. Applications of Hopfield Nets <ul><li>Hopfield nets are applied for Optimization problems. </li></ul><ul><li>Optimization problems maximize or minimize a function. </li></ul><ul><li>In Hopfield Network the energy gets minimized. </li></ul>
  17. 17. Traveling Salesman Problem <ul><li>Given a set of cities and the distances between them, determine the shortest closed path passing through all the cities exactly once. </li></ul>
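For a handful of cities the problem statement can be turned into code by simply trying every closed path. The sketch below is a brute-force baseline, not one of the neural methods discussed in these slides; the function name and the 4-city distance matrix are made up for illustration.

```python
from itertools import permutations

def brute_force_tsp(d):
    """Try every tour that starts at city 0; return (best_length, best_tour)."""
    n = len(d)
    best_length, best_tour = float("inf"), None
    for rest in permutations(range(1, n)):
        tour = (0,) + rest
        # Closed path: the last city connects back to the first.
        length = sum(d[tour[i]][tour[(i + 1) % n]] for i in range(n))
        if length < best_length:
            best_length, best_tour = length, tour
    return best_length, best_tour

# Hypothetical symmetric distances between four cities.
d = [[0, 2, 9, 10],
     [2, 0, 6, 4],
     [9, 6, 0, 3],
     [10, 4, 3, 0]]

best_length, best_tour = brute_force_tsp(d)
```

The loop visits (N-1)! tours, which is why this approach stops being practical after roughly a dozen cities.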
  18. 18. Traveling Salesman Problem <ul><li>One of the classic and most heavily researched problems in computer science. </li></ul><ul><li>The decision problem “Is there a tour with length less than k?” is NP-complete </li></ul><ul><li>The optimization problem “What is the shortest tour?” is NP-hard </li></ul>
  19. 19. Hopfield Net for TSP <ul><li>N cities are represented by an N × N matrix of neurons </li></ul><ul><li>Each row has exactly one 1 </li></ul><ul><li>Each column has exactly one 1 </li></ul><ul><li>The matrix has exactly N 1’s </li></ul><ul><li>$\sigma_{kj} = 1$ if city $k$ is in position $j$, and $\sigma_{kj} = 0$ otherwise </li></ul>
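The matrix encoding is easy to get backwards, so a small sketch may help. It converts a tour to the N × N matrix, checks the row and column constraints, and reads the tour back. The helper names are hypothetical and cities are numbered from 0.

```python
import numpy as np

def tour_to_matrix(tour):
    """sigma[k, j] = 1 if city k is visited at position j, else 0."""
    n = len(tour)
    sigma = np.zeros((n, n), dtype=int)
    for position, city in enumerate(tour):
        sigma[city, position] = 1
    return sigma

def is_valid_tour_matrix(sigma):
    """Only 0/1 entries, exactly one 1 per row and one 1 per column."""
    binary = np.all((sigma == 0) | (sigma == 1))
    one_per_row = np.all(sigma.sum(axis=1) == 1)
    one_per_col = np.all(sigma.sum(axis=0) == 1)
    return bool(binary and one_per_row and one_per_col)

def matrix_to_tour(sigma):
    """Read off the city occupying each position, column by column."""
    return [int(np.argmax(sigma[:, j])) for j in range(sigma.shape[1])]

sigma = tour_to_matrix([2, 0, 3, 1])  # visit city 2 first, then 0, 3, 1
```

A matrix that passes `is_valid_tour_matrix` automatically contains exactly N ones, so the third condition on the slide follows from the first two.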
  20. 20. Hopfield Net for TSP <ul><li>For each element of the matrix take a neuron and fully connect the assembly with symmetric weights </li></ul><ul><li>Finding a suitable energy function E </li></ul>
  21. 21. Determination of Energy Function <ul><li>The energy function for TSP has four components satisfying four constraints </li></ul><ul><li>Each city can have no more than one position, i.e. each row can have no more than one activated neuron </li></ul><ul><ul><li>$E_1 = \frac{A}{2} \sum_k \sum_i \sum_{j \neq i} \sigma_{ki}\, \sigma_{kj}$, where $A$ is a constant </li></ul></ul>
  22. 22. Energy Function (Contd..) <ul><li>Each position contains no more than one city, i.e. each column contains no more than one activated neuron </li></ul><ul><li>$E_2 = \frac{B}{2} \sum_j \sum_k \sum_{r \neq k} \sigma_{kj}\, \sigma_{rj}$, where $B$ is a constant </li></ul>
  23. 23. Energy Function (Contd..) <ul><li>There are exactly N entries in the output matrix, i.e. there are N 1’s in the output matrix </li></ul><ul><li>$E_3 = \frac{C}{2} \left(N - \sum_k \sum_i \sigma_{ki}\right)^2$, where $C$ is a constant </li></ul>
  24. 24. Energy Function (cont..) <ul><li>The fourth term incorporates the requirement of the shortest path </li></ul><ul><li>$E_4 = \frac{D}{2} \sum_k \sum_{r \neq k} \sum_j d_{kr}\, \sigma_{kj} \left(\sigma_{r(j+1)} + \sigma_{r(j-1)}\right)$ </li></ul><ul><li>where $d_{kr}$ is the distance between city $k$ and city $r$ </li></ul><ul><li>$E_{\text{total}} = E_1 + E_2 + E_3 + E_4$ </li></ul>
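The four energy terms can be evaluated for any activation matrix, which shows what each one penalizes. The sketch below makes several assumptions: positions j+1 and j-1 wrap around modulo N (the tour is closed), the distance matrix has a zero diagonal so the r ≠ k restriction in the fourth term is automatic, and the constants default to 1. The example cities sit on the corners of a unit square.

```python
import numpy as np

def tsp_energy_terms(sigma, d, A=1.0, B=1.0, C=1.0, D=1.0):
    """Return (E1, E2, E3, E4) for an N x N activation matrix sigma[k, j]."""
    n = sigma.shape[0]
    row = sigma.sum(axis=1)   # total activation per city
    col = sigma.sum(axis=0)   # total activation per position
    # sum_i sum_{j != i} s_ki s_kj = (sum_i s_ki)^2 - sum_i s_ki^2
    e1 = A / 2 * np.sum(row**2 - (sigma**2).sum(axis=1))
    e2 = B / 2 * np.sum(col**2 - (sigma**2).sum(axis=0))
    e3 = C / 2 * (n - sigma.sum())**2
    # neighbours[r, j] = sigma[r, j+1] + sigma[r, j-1], indices modulo N
    neighbours = np.roll(sigma, -1, axis=1) + np.roll(sigma, 1, axis=1)
    e4 = D / 2 * np.sum(d * (sigma @ neighbours.T))
    return e1, e2, e3, e4

coords = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)
d = np.linalg.norm(coords[:, None] - coords[None, :], axis=-1)

sigma = np.eye(4)                        # city k visited at position k
e1, e2, e3, e4 = tsp_energy_terms(sigma, d)
```

For this valid tour the three constraint terms vanish and the fourth term equals D times the tour length (the perimeter of the square, 4), because every edge is counted once from each of its two endpoints.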
  25. 25. Energy Function (cont..) <ul><li>The energy equation is also given by </li></ul><ul><li>$E = -\frac{1}{2} \sum_{ki} \sum_{rj} w_{(ki)(rj)}\, \sigma_{ki}\, \sigma_{rj}$ </li></ul><ul><ul><li>$\sigma_{ki}$ – City k at position i </li></ul></ul><ul><ul><li>$\sigma_{rj}$ – City r at position j </li></ul></ul><ul><li>Output function $\sigma_{ki}$ </li></ul><ul><ul><li>$\sigma_{ki} = \frac{1}{2} \left(1 + \tanh(u_{ki}/u_0)\right)$ </li></ul></ul><ul><ul><li>$u_0$ is a constant </li></ul></ul><ul><ul><li>$u_{ki}$ is the net input </li></ul></ul>
  26. 26. Weight Value <ul><li>Comparing the above equations with the energy equation obtained previously </li></ul><ul><li>$w_{(ki)(rj)} = -A\, \delta_{kr} (1 - \delta_{ij}) - B\, \delta_{ij} (1 - \delta_{kr}) - C - D\, d_{kr} \left(\delta_{j(i+1)} + \delta_{j(i-1)}\right)$ </li></ul><ul><li>Kronecker symbol : $\delta_{kr}$ </li></ul><ul><ul><li>$\delta_{kr} = 1$ when $k = r$ </li></ul></ul><ul><ul><li>$\delta_{kr} = 0$ when $k \neq r$ </li></ul></ul>
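The weight formula can be built as a four-index array and checked against the energy terms. In this sketch the row-inhibition factor is taken as $(1 - \delta_{ij})$, which is what matching the $E_1$ term requires, and position indices wrap modulo N. One detail the quadratic form alone cannot carry: expanding $E_3$ also produces a linear term $-C N \sum \sigma$ and a constant $C N^2 / 2$, so they are added separately here. The constants are illustrative values only, and the cities are the corners of a unit square.

```python
import numpy as np

def tsp_weights(d, A, B, C, D):
    """W[k, i, r, j] couples neuron (city k, position i) to (city r, position j)."""
    n = d.shape[0]
    eye = np.eye(n)
    # adj[i, j] = delta_{j,i+1} + delta_{j,i-1}, positions taken modulo N
    adj = np.roll(eye, 1, axis=1) + np.roll(eye, -1, axis=1)
    w = np.zeros((n, n, n, n))
    w -= A * np.einsum("kr,ij->kirj", eye, 1 - eye)  # same city, different position
    w -= B * np.einsum("ij,kr->kirj", eye, 1 - eye)  # same position, different city
    w -= C                                           # global inhibition
    w -= D * np.einsum("kr,ij->kirj", d, adj)        # distance to tour neighbours
    return w

coords = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)
d = np.linalg.norm(coords[:, None] - coords[None, :], axis=-1)
A, B, C, D = 500.0, 500.0, 200.0, 500.0  # illustrative values only
n = 4
w = tsp_weights(d, A, B, C, D)

sigma = np.eye(n)  # valid tour: city k at position k, length 4
quadratic = -0.5 * np.einsum("kirj,ki,rj->", w, sigma, sigma)
total = quadratic - C * n * sigma.sum() + C * n**2 / 2
```

For a valid tour the constraint terms cancel and `total` reduces to D times the tour length, here 500 × 4 = 2000.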
  27. 27. Observation <ul><li>The choice of constants A, B, C and D that gives a good solution is a trade-off between </li></ul><ul><ul><li>Always obtaining legitimate loops (D is small relative to A, B and C) </li></ul></ul><ul><ul><li>Giving heavier weight to the distances (D is large relative to A, B and C) </li></ul></ul>
  28. 28. Observation (cont..) <ul><li>Local minima </li></ul><ul><ul><li>Energy function full of dips, valleys and local minima </li></ul></ul><ul><li>Speed </li></ul><ul><ul><li>Fast due to rapid computational capacity of network </li></ul></ul>
  29. 29. Concurrent Neural Network <ul><li>Proposed by N. Toomarian in 1988 </li></ul><ul><li>It requires $N \log N$ neurons to solve a TSP of N cities. </li></ul><ul><li>It also has a much higher probability of reaching a valid tour than the Hopfield network. </li></ul>
  30. 30. Objective Function <ul><li>The aim is to minimize the distance between city k at position i and city r at position i+1 </li></ul><ul><li>$E_i = \sum_{k \neq r} \sum_r \sum_i \delta_{ki}\, \delta_{r(i+1)}\, d_{kr}$ </li></ul><ul><li>where $\delta$ is the Kronecker symbol </li></ul>
  31. 31. Cont … <ul><li>$E_i = \frac{1}{N^2} \sum_{k \neq r} \sum_r \sum_i d_{kr} \prod_{i=1}^{\ln N} \left[1 + (2\nu_i - 1)\, \sigma_{ki}\right] \left[1 + (2\mu_i - 1)\, \sigma_{ri}\right]$ </li></ul><ul><li>where $(2\mu_i - 1) = (2\nu_i - 1) \left[1 - \prod_{j=1}^{i-1} \nu_i\right]$ </li></ul><ul><li>Also, to ensure that two cities do not occupy the same position </li></ul><ul><li>$E_{\text{error}} = \sum_{k \neq r} \sum_r \delta_{kr}$ </li></ul>
  32. 32. Solution <ul><li>$E_{\text{error}}$ has the value 0 for any valid tour. </li></ul><ul><li>So we have a constrained optimization problem to solve. </li></ul><ul><li>$E = E_i + \lambda E_{\text{error}}$ </li></ul><ul><li>$\lambda$ is the Lagrange multiplier, to be calculated from the solution. </li></ul>
  33. 33. Minimization of energy function <ul><li>The energy function to be minimized is expressed in terms of $\sigma_{ki}$ </li></ul><ul><li>The algorithm is an iterative procedure of the kind usually used for minimizing quadratic functions </li></ul><ul><li>The iteration steps are carried out in the direction of steepest descent with respect to the energy function $E$ </li></ul>
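  34. 34. Minimization of energy function <ul><li>Differentiating the energy </li></ul><ul><li>$\frac{dU_{ki}}{dt} = -\frac{\partial E}{\partial \sigma_{ki}} = -\frac{\partial E_i}{\partial \sigma_{ki}} - \lambda \frac{\partial E_{\text{error}}}{\partial \sigma_{ki}}$ </li></ul><ul><li>$\frac{d\lambda}{dt} = \pm \frac{\partial E}{\partial \lambda} = \pm E_{\text{error}}$ </li></ul><ul><li>$\sigma_{ki} = \tanh(\alpha U_{ki})$, where $\alpha$ is a constant </li></ul>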
  35. 35. Implementation <ul><li>The initial input matrix and the value of $\lambda$ are randomly selected and specified </li></ul><ul><li>At each iteration, new values of $\sigma_{ki}$ and $\lambda$ are calculated in the direction of steepest descent of the energy function </li></ul><ul><li>Iterations stop either when convergence is achieved or when the number of iterations exceeds a user-specified limit </li></ul>
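The iteration scheme described here (descend on the variables, update the multiplier from the constraint violation, stop on convergence or at an iteration cap) can be illustrated on a toy problem. The sketch below does not implement the TSP energy of the Concurrent Neural Network. It minimizes the hypothetical function f(x, y) = x² + y² subject to x + y = 1, with the constraint violation playing the role of $E_{\text{error}}$ and the "+" sign chosen for the multiplier update. The fixed starting point, step size and tolerance are assumptions made for reproducibility.

```python
def constrained_descent(step=0.05, max_iters=5000, tol=1e-10):
    """Minimise f(x, y) = x^2 + y^2 subject to g(x, y) = x + y - 1 = 0."""
    x, y, lam = 2.0, -1.0, 0.0      # arbitrary starting point
    iterations = 0
    for iterations in range(1, max_iters + 1):
        g = x + y - 1.0             # constraint violation
        dx = -(2.0 * x + lam)       # -dE/dx with E = f + lam * g
        dy = -(2.0 * y + lam)       # -dE/dy
        dlam = g                    # +dE/dlam: ascent on the multiplier
        x, y, lam = x + step * dx, y + step * dy, lam + step * dlam
        if abs(dx) + abs(dy) + abs(dlam) < tol:
            break                   # converged
    return x, y, lam, iterations

x, y, lam, iterations = constrained_descent()
```

The loop settles at x = y = 0.5 with multiplier -1, which is the constrained minimum, well before the iteration cap is reached.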
  36. 36. Comparison – Hopfield vs Concurrent NN <ul><li>The Concurrent Neural Network converges faster than the Hopfield Network </li></ul><ul><li>Its probability of achieving a valid tour is higher than that of the Hopfield Network </li></ul><ul><li>The Hopfield Network has no systematic way to determine the constant terms. </li></ul>
  37. 37. Comparison – SOM and Concurrent NN <ul><li>The data set consists of 52 cities in Germany and a subset of 15 of those cities. </li></ul><ul><li>Both algorithms were run 80 times on the 15-city data set. </li></ul><ul><li>The 52-city data set could be analyzed only with SOM; the Concurrent Neural Net failed to analyze it. </li></ul>
  38. 38. Result <ul><li>The Concurrent Neural Network always converged and never missed any city, whereas SOM is capable of missing cities. </li></ul><ul><li>The Concurrent Neural Network is very erratic in behavior, whereas SOM is more reliable at detecting every link in the shortest path. </li></ul><ul><li>Overall, the Concurrent Neural Network performed poorly compared to SOM. </li></ul>
  39. 39. Shortest path generated <ul><li>Concurrent Neural Network: 2127 km </li></ul><ul><li>Self Organizing Maps: 1311 km </li></ul>
  40. 40. Behavior in terms of probability [Figure panels: Concurrent Neural Network, Self Organizing Maps]
  41. 41. Conclusion <ul><li>Hopfield Network can also be used for optimization problems. </li></ul><ul><li>Concurrent Neural Network performs better than Hopfield Network and uses fewer neurons. </li></ul><ul><li>Concurrent and Hopfield Neural Networks are less efficient than SOM for solving TSP. </li></ul>
  42. 42. References <ul><li>N. K. Bose and P. Liang, “Neural Network Fundamentals with Graphs, Algorithms and Applications”, Tata McGraw Hill Publication, 1996 </li></ul><ul><li>P. D. Wasserman, “Neural computing: theory and practice”, Van Nostrand Reinhold Co., 1989 </li></ul><ul><li>N. Toomarian, “A Concurrent Neural Network algorithm for the Traveling Salesman Problem”, ACM Proceedings of the third conference on Hypercube concurrent computers and applications, pp. 1483-1490, 1988 </li></ul>
  43. 43. References <ul><li>R. Reilly, “Neural Network approach to solving the Traveling Salesman Problem”, Journals of Computer Science in Colleges, pp. 41-61, October 2003 </li></ul><ul><li>Wolfram Research Inc., “Tutorial on Neural Networks”, http://documents.wolfram.com/applications/neuralnetworks/NeuralNetworkTheory/2.7.0.html , 2004 </li></ul><ul><li>Prof. P. Bhattacharyya, “Introduction to computing with Neural Nets”, http://www.cse.iitb.ac.in/~pb/Teaching.html </li></ul>
  44. 45. NP-complete NP-hard <ul><li>When the decision version of a combinatorial optimization problem is proved to be NP-complete (a class that includes well-known problems such as satisfiability, traveling salesman and bin packing), the optimization version is NP-hard. </li></ul>
  45. 46. NP-complete NP-hard <ul><li>“Is there a tour with length less than k?” is NP-complete: </li></ul><ul><li>It is easy to check whether a proposed certificate (a tour) has length less than k </li></ul><ul><li>The optimization problem: </li></ul><ul><li>“What is the shortest tour?” is NP-hard, since there is no known easy way to check whether a certificate is the shortest. </li></ul>
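Checking a certificate for the decision version takes only a pass over the proposed tour, which the following sketch illustrates. The function names and the 4-city distance matrix are hypothetical.

```python
def tour_length(tour, d):
    """Length of the closed tour, including the return to the starting city."""
    n = len(tour)
    return sum(d[tour[i]][tour[(i + 1) % n]] for i in range(n))

def verify_certificate(tour, d, k):
    """Check that `tour` visits every city exactly once with length < k."""
    n = len(d)
    visits_each_city_once = sorted(tour) == list(range(n))
    # The length is only computed for well-formed tours (short-circuit).
    return visits_each_city_once and tour_length(tour, d) < k

# Hypothetical symmetric distances between four cities.
d = [[0, 2, 9, 10],
     [2, 0, 6, 4],
     [9, 6, 0, 3],
     [10, 4, 3, 0]]

accepted = verify_certificate([0, 1, 3, 2], d, 19)   # length 18 < 19
rejected = verify_certificate([0, 1, 3, 2], d, 18)   # length 18 is not < 18
```

Verifying "length less than k" costs time proportional to N, whereas this check alone says nothing about whether some other tour is shorter.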
  46. 47. Path lengths [Figure panels: Concurrent Neural Network, Self Organizing Maps]
