Artificial Neural Network


  1. 1. An Introduction to Artificial Neural Network Dr Iman Ardekani
  2. 2. Biological Neurons Modeling Neurons McCulloch-Pitts Neuron Network Architecture Learning Process Perceptron Linear Neuron Multilayer Perceptron Content
  3. 3. Ramón y Cajal (Spanish scientist, 1852-1934): 1. The brain is composed of individual cells called neurons. 2. Neurons are connected to each other by synapses. Biological Neurons
  4. 4. Neuron Structure (Biology) Biological Neurons Dendrites Cell body Nucleus Axon Axon Terminals
  5. 5. Synaptic Junction (Biology) Biological Neurons Synapse
  6. 6. Neuron Function (Biology) 1. Dendrites receive signals from other neurons. 2. Neurons can process (or transfer) the received signals. 3. Axon terminals deliver the processed signal to other tissues. Biological Neurons What kind of signals? Electrical impulse signals
  7. 7. Modeling Neurons (Computer Science) Biological Neurons [diagram: inputs → process → outputs]
  8. 8. Modeling Neurons The net input signal is a linear combination of the input signals xi. Each output is a function of the net input signal. Biological Neurons [diagram: inputs x1 … x10, output y]
  9. 9. - McCulloch and Pitts (1943) for introducing the idea of neural networks as computing machines - Hebb (1949) for inventing the first rule for self-organized learning - Rosenblatt (1958) for proposing the perceptron as the first model for learning with a teacher Modelling Neurons
  10. 10. The net input signal received through the synaptic junctions is net = b + Σ wixi = b + WTX, with weight vector W = [w1 w2 … wm]T and input vector X = [x1 x2 … xm]T. Each output is a function of the net stimulus signal (f is called the activation function): y = f(net) = f(b + WTX). Modelling Neurons
  11. 11. General Model for Neurons Modelling Neurons y = f(b + Σ wixi) [diagram: inputs x1 … xm with weights w1 … wm, bias b, summation Σ, activation f, output y]
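A minimal sketch of this computation in Python (the particular weights, inputs, and the hard-limiter choice of f below are illustrative, not taken from the slides):

```python
import numpy as np

# General neuron model: y = f(b + sum_i w_i x_i) = f(b + W^T X)
W = np.array([0.5, -1.0, 0.25])   # weight vector W (illustrative values)
X = np.array([1.0,  2.0, 4.0])    # input vector X (illustrative values)
b = 0.1                           # bias

net = b + W @ X                   # net input signal
y = 1 if net >= 0 else 0          # f chosen here as a hard limiter (threshold)
print(net, y)                     # -0.4  0
```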
  12. 12. Activation Functions Modelling Neurons Threshold function / hard limiter (good for classification), linear function (simple computation), sigmoid function (continuous & differentiable) [plots of the three functions]
  13. 13. Sigmoid Function Modelling Neurons f(x) = 1 / (1 + e^(-ax)), which approaches the threshold function as a goes to infinity
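A small sketch (function names are my own) showing numerically that the sigmoid tends toward the hard limiter as the slope parameter a grows:

```python
import numpy as np

def sigmoid(x, a=1.0):
    """Sigmoid activation f(x) = 1 / (1 + exp(-a*x))."""
    return 1.0 / (1.0 + np.exp(-a * x))

def hard_limiter(x):
    """Threshold activation: 1 if x >= 0, else 0."""
    return np.where(x >= 0, 1.0, 0.0)

x = np.array([-1.0, -0.1, 0.0, 0.1, 1.0])
for a in (1, 10, 100):
    print(a, sigmoid(x, a).round(3))   # tends toward hard_limiter(x) as a grows
print("limit:", hard_limiter(x))
```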
  14. 14. McCulloch-Pitts Neuron Modelling Neurons Hard-limiter activation, output y ∈ {0, 1} [diagram: inputs x1 … xm, weights w1 … wm, bias b, summation Σ, hard limiter, output y]
  15. 15. Modelling Neurons Sigmoid activation, output y ∈ [0, 1] [diagram: inputs x1 … xm, weights w1 … wm, bias b, summation Σ, sigmoid, output y]
  16. 16. Single-input McCulloch-Pitts neurode with b=0, w1=-1 for binary inputs: x1=0 → net=0, y=1; x1=1 → net=-1, y=0. Conclusion? McCulloch-Pitts Neuron [diagram: input x1, weight w1, bias b, output y]
  17. 17. Two-input McCulloch-Pitts neurode with b=-1, w1=w2=1 for binary inputs: (x1, x2, net, y) = (0,0,?,?), (0,1,?,?), (1,0,?,?), (1,1,?,?). McCulloch-Pitts Neuron [diagram: inputs x1, x2, weights w1, w2, bias b, output y]
  18. 18. Two-input McCulloch-Pitts neurode with b=-1, w1=w2=1 for binary inputs: (x1, x2, net, y) = (0,0,-1,0), (0,1,0,1), (1,0,0,1), (1,1,1,1), i.e. the OR function. McCulloch-Pitts Neuron [diagram: inputs x1, x2, weights w1, w2, bias b, output y]
  19. 19. Two-input McCulloch-Pitts neurode with b=-2, w1=w2=1 for binary inputs: (x1, x2, net, y) = (0,0,?,?), (0,1,?,?), (1,0,?,?), (1,1,?,?). McCulloch-Pitts Neuron [diagram: inputs x1, x2, weights w1, w2, bias b, output y]
  20. 20. Two-input McCulloch-Pitts neurode with b=-2, w1=w2=1 for binary inputs: (x1, x2, net, y) = (0,0,-2,0), (0,1,-1,0), (1,0,-1,0), (1,1,0,1), i.e. the AND function. McCulloch-Pitts Neuron [diagram: inputs x1, x2, weights w1, w2, bias b, output y]
  21. 21. Every basic Boolean function can be implemented using combinations of McCulloch-Pitts Neurons. McCulloch-Pitts Neuron
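A minimal sketch of these Boolean units in Python, wiring in the weights and biases from the truth-table slides above (the helper names are my own):

```python
import numpy as np

def mp_neuron(x, w, b):
    """McCulloch-Pitts neuron: hard limiter on net = b + w^T x, output in {0, 1}."""
    net = b + np.dot(w, x)
    return 1 if net >= 0 else 0

NOT = lambda x1:     mp_neuron([x1],     [-1],   b=0)    # slide 16
OR  = lambda x1, x2: mp_neuron([x1, x2], [1, 1], b=-1)   # slides 17-18
AND = lambda x1, x2: mp_neuron([x1, x2], [1, 1], b=-2)   # slides 19-20

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "AND:", AND(x1, x2), "OR:", OR(x1, x2))
print("NOT 0 =", NOT(0), "  NOT 1 =", NOT(1))
```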
  22. 22. The McCulloch-Pitts neuron can be used as a classifier that separates the input signals into two classes A and B (perceptron). McCulloch-Pitts Neuron Class A (y=1): y = 1 ⇔ net ≥ 0 ⇔ b + w1x1 + w2x2 ≥ 0. Class B (y=0): y = 0 ⇔ net < 0 ⇔ b + w1x1 + w2x2 < 0.
  23. 23. Class A ⇔ b + w1x1 + w2x2 ≥ 0, Class B ⇔ b + w1x1 + w2x2 < 0. Therefore the decision boundary is a line (a hyperplane in higher dimensions) given by b + w1x1 + w2x2 = 0. Where do w1 and w2 come from? McCulloch-Pitts Neuron
  24. 24. Solution: More Neurons Required McCulloch-Pitts Neuron [plots of x1 AND x2, x1 OR x2, and x1 XOR x2 in the (x1, x2) plane ?]
  25. 25. Nonlinear Classification McCulloch-Pitts Neuron [plot: class A (y=1) and class B (y=0) in the (x1, x2) plane]
  26. 26. Single Layer Feed-forward Network Network Architecture Single layer: there is only one computational layer. Feed-forward: the input layer projects onto the output layer, not vice versa. [diagram: input layer (sources) → output layer (neurons)]
  27. 27. Multi Layer Feed-forward Network Network Architecture [diagram: input layer (sources) → hidden layer (neurons) → output layer (neurons)]
  28. 28. Single Layer Recurrent Network Network Architecture [diagram: one layer of neurons with feedback connections]
  29. 29. Multi Layer Recurrent Network Network Architecture [diagram: multiple layers of neurons with feedback connections]
  30. 30. The mechanism by which a neural network adjusts its weights (synaptic junction weights): - Supervised learning: with a teacher - Unsupervised learning: without a teacher Learning Processes
  31. 31. Supervised Learning Learning Processes [diagram: the environment supplies input data to a teacher and a learner; the error signal is the desired response minus the actual response]
  32. 32. Unsupervised Learning Learning Processes [diagram: the environment supplies input data to the learner] Neurons learn based on a competitive task. A competition rule is required (competitive-learning rule).
  33. 33. - The goal of the perceptron is to classify input data into two classes A and B - This is possible only when the two classes can be separated by a linear boundary - The perceptron is built around the McCulloch-Pitts neuron model: a linear combiner followed by a hard limiter - Accordingly the neuron can produce +1 and 0 Perceptron [diagram: inputs x1 … xm, weights w1 … wm, bias b, output y]
  34. 34. Equivalent Presentation Perceptron The bias can be absorbed as an extra weight w0 applied to a constant input 1, so that net = WTX with weight vector W = [w0 w1 … wm]T and input vector X = [1 x1 x2 … xm]T [diagram: inputs 1, x1 … xm with weights w0, w1 … wm]
  35. 35. There exists a weight vector W such that WTx > 0 for every input vector x belonging to A, and WTx ≤ 0 for every input vector x belonging to B. Perceptron
  36. 36. Elementary Perceptron Learning Algorithm 1) Initialization: w(0) = a random weight vector 2) At time index n, form the input vector x(n) 3) IF (wT(n-1)x(n) > 0 and x(n) belongs to A) or (wT(n-1)x(n) ≤ 0 and x(n) belongs to B) THEN w(n) = w(n-1); otherwise w(n) = w(n-1) + ηx(n) if x(n) belongs to A, or w(n) = w(n-1) - ηx(n) if x(n) belongs to B 4) Repeat from step 2 until w(n) converges Perceptron
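A sketch of this procedure in Python (the toy data, names, and stopping criterion are illustrative): a misclassified class-A vector pulls the weights toward it, a misclassified class-B vector pushes them away.

```python
import numpy as np

def train_perceptron(X, labels, eta=0.1, epochs=500):
    """Elementary perceptron learning as in the algorithm above.
    Rows of X are augmented input vectors [1, x1, ..., xm]; labels are 1 (class A) or 0 (class B)."""
    w = np.random.randn(X.shape[1])              # 1) random initial weight vector
    for _ in range(epochs):
        changed = False
        for x, t in zip(X, labels):              # 2) present the input vectors
            correct = (w @ x > 0) == (t == 1)    # 3) already on the right side?
            if not correct:
                w = w + eta * x if t == 1 else w - eta * x
                changed = True
        if not changed:                          # 4) converged: a full pass with no updates
            break
    return w

# Toy linearly separable data (illustrative values).
X = np.array([[1,  2.0,  2.5], [1,  1.5,  3.0],    # class A
              [1, -1.0, -2.0], [1, -2.0, -0.5]])   # class B
labels = np.array([1, 1, 0, 0])
w = train_perceptron(X, labels)
print("weights:", w.round(2), "decisions:", [int(w @ x > 0) for x in X])
```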
  37. 37. - When the activation function is simply f(x) = x, the neuron acts as an adaptive filter - In this case y = net = wTx - w = [w1 w2 … wm]T - x = [x1 x2 … xm]T Linear Neuron [diagram: inputs x1 … xm, weights w1 … wm, bias b, output y]
  38. 38. Linear Neuron Learning (Adaptive Filtering) Linear Neuron [diagram: the environment supplies input data to a teacher and the learner (linear neuron); the error signal is the desired response d(n) minus the actual response y(n)]
  39. 39. LMS Algorithm To minimize the cost function defined as E(w) = 0.5 e^2(n), where e(n) is the error signal e(n) = d(n) - y(n) = d(n) - wT(n)x(n), the weight vector can be updated as wi(n+1) = wi(n) - μ (dE/dwi). Linear Neuron
  40. 40. LMS Algorithm (continued) dE/dwi = e(n) de(n)/dwi = e(n) d/dwi {d(n) - wT(n)x(n)} = -e(n) xi(n), so wi(n+1) = wi(n) + μe(n)xi(n), or in vector form w(n+1) = w(n) + μe(n)x(n). Linear Neuron
  41. 41. Summary of LMS Algorithm 1) Initialization: w(0) = a random weight vector 2) At time index n, form the input vector x(n) 3) y(n) = wT(n)x(n) 4) e(n) = d(n) - y(n) 5) w(n+1) = w(n) + μe(n)x(n) 6) Repeat from step 2 until w(n) converges Linear Neuron
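A minimal sketch of the LMS loop in Python (the toy system-identification data and names are illustrative):

```python
import numpy as np

def lms(X, d, mu=0.01, epochs=50):
    """LMS training of a linear neuron, following the summary above.
    Rows of X are input vectors x(n); d holds the desired responses d(n)."""
    w = np.zeros(X.shape[1])             # 1) initial weights (zeros here; the slide uses a random vector)
    for _ in range(epochs):
        for x_n, d_n in zip(X, d):
            y_n = w @ x_n                # 3) y(n) = w^T x(n)
            e_n = d_n - y_n              # 4) e(n) = d(n) - y(n)
            w = w + mu * e_n * x_n       # 5) w(n+1) = w(n) + mu e(n) x(n)
    return w

# Toy problem: identify the weights of a known linear system from input/output pairs.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_w = np.array([0.5, -1.0, 2.0])
d = X @ true_w
print("estimated w:", lms(X, d).round(3))   # should be close to true_w
```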
  42. 42. - To solve the XOR problem - To solve nonlinear classification problems - To deal with more complex problems Multilayer Perceptron
  43. 43. - The activation function in a multilayer perceptron is usually a sigmoid function - This is because the sigmoid is differentiable, unlike the hard limiter used in the elementary perceptron Multilayer Perceptron
  44. 44. Architecture Multilayer Perceptron [diagram: input layer (sources) → hidden layer (neurons) → output layer (neurons)]
  45. 45. Architecture Multilayer Perceptron [diagram: layer k receives its inputs from layer k-1 and sends its outputs to layer k+1]
  46. 46. Consider a single neuron in a multilayer perceptron (neuron k). Multilayer Perceptron [diagram: inputs y0 = 1, y1, …, ym with weights wk,0, wk,1, …, wk,m] netk = Σ wk,i yi, yk = f(netk)
  47. 47. Multilayer Perceptron Learning (Back Propagation Algorithm) Multilayer Perceptron [diagram: layer k-1 feeds neuron k; the error signal ek(n) is the desired response dk(n) minus the actual response yk(n)]
  48. 48. Back Propagation Algorithm: Cost function of neuron k: Ek = 0.5 ek^2. Cost function of the network: E = Σ Ej = 0.5 Σ ej^2. Error: ek = dk - yk = dk - f(netk) = dk - f(Σ wk,i yi). Multilayer Perceptron
  49. 49. Back Propagation Algorithm: Cost function of neuron k: Ek = 0.5 ek^2, where ek = dk - yk = dk - f(netk) = dk - f(Σ wk,i yi). Multilayer Perceptron
  50. 50. Cost function of the network: E = Σ Ej = 0.5 Σ ej^2 (summed over the output neurons j). Multilayer Perceptron
  51. 51. Back Propagation Algorithm: To minimize the cost function E(wk,i), the weights can be updated using a gradient-based algorithm: wk,i(n+1) = wk,i(n) - μ (dE/dwk,i). What is dE/dwk,i? Multilayer Perceptron
  52. 52. Back Propagation Algorithm: dE/dwk,i = (dE/dek)(dek/dyk)(dyk/dnetk)(dnetk/dwk,i) = (dE/dnetk)(dnetk/dwk,i) = δk yi, where the local gradient is δk = dE/dnetk = (dE/dek)(dek/dyk)(dyk/dnetk) = -ek f'(netk), using dE/dek = ek, dek/dyk = -1, dyk/dnetk = f'(netk) and dnetk/dwk,i = yi. Multilayer Perceptron
  53. 53. Back Propagation Algorithm: Substituting dE/dwk,i = δk yi into the gradient-based algorithm gives wk,i(n+1) = wk,i(n) - μ δk yi, with δk = -ek f'(netk) = -{dk - yk} f'(netk). If k is an output neuron we have all the terms of δk. What if k is a hidden neuron? Multilayer Perceptron
  54. 54. Back Propagation Algorithm: When k is hidden, δk = dE/dnetk = (dE/dyk)(dyk/dnetk) = f'(netk) (dE/dyk). What is dE/dyk? Multilayer Perceptron
  55. 55. With E = 0.5 Σ ej^2: dE/dyk = Σ ej (dej/dyk) = Σ ej (dej/dnetj)(dnetj/dyk) = Σ ej (-f'(netj)) wj,k = -Σ ej f'(netj) wj,k = Σ δj wj,k, summed over the neurons j of the next layer that receive yk. Multilayer Perceptron
  56. 56. We had δk = f'(netk) (dE/dyk). Substituting dE/dyk = Σ δj wj,k into this gives δk = f'(netk) Σ δj wj,k, which is the local gradient for the hidden neuron k. Multilayer Perceptron
  57. 57. Summary of Back Propagation Algorithm 1) Initialization: w(0) = a random weight vector; at time index n, get the input data and the desired responses 2) Calculate netk and yk for all the neurons 3) For output neurons, calculate ek = dk - yk 4) For output neurons, δk = -ek f'(netk) 5) For hidden neurons, δk = f'(netk) Σ δj wj,k 6) Update every neuron's weights by wk,i(NEW) = wk,i(OLD) - μ δk yi 7) Repeat steps 2-6 for the next data set (next time index) Multilayer Perceptron
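A compact sketch of this procedure in Python, applied to the XOR problem from slide 42. The layer sizes, learning rate, and initialization are my own choices, and δ is written here with the opposite sign (δ = e·f'(net)), so step 6 becomes W += μ δ y; this is equivalent to the update above with δ = -e·f'(net).

```python
import numpy as np

def f(x):        # sigmoid activation
    return 1.0 / (1.0 + np.exp(-x))

def f_prime(x):  # derivative of the sigmoid
    s = f(x)
    return s * (1.0 - s)

# XOR patterns, with a constant 1 appended so each neuron's last weight acts as its bias.
X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=float)
d = np.array([0.0, 1.0, 1.0, 0.0])

rng = np.random.default_rng(1)
W1 = rng.normal(size=(3, 3))   # hidden layer: 3 neurons, 3 inputs (2 signals + bias)
W2 = rng.normal(size=(1, 4))   # output layer: 1 neuron, 3 hidden outputs + bias
mu = 0.5

for _ in range(20000):
    for x, dk in zip(X, d):
        # 2) forward pass: net_k and y_k for every neuron
        net1 = W1 @ x
        y1 = np.append(f(net1), 1.0)     # hidden outputs plus the bias input
        net2 = W2 @ y1
        y2 = f(net2)
        # 3-5) error and local gradients (sign-flipped: delta = e * f'(net))
        e = dk - y2
        delta2 = e * f_prime(net2)                        # output neuron
        delta1 = f_prime(net1) * (W2[:, :3].T @ delta2)   # hidden neurons
        # 6) weight updates
        W2 += mu * np.outer(delta2, y1)
        W1 += mu * np.outer(delta1, x)

out = [f(W2 @ np.append(f(W1 @ x), 1.0))[0] for x in X]
print(np.round(out, 2))   # typically close to [0, 1, 1, 0]
```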
  58. 58. THE END
