Course notes (.ppt)


  1. Neural Networks • Resources – Chapter 19, textbook • Sections 19.1-19.6
  2. Neuroanatomy Metaphor • Neural networks (aka connectionist, PDP, artificial neural networks, ANN) – Rough approximation to the animal nervous system – See systems such as NEURON for modeling at more biological levels of detail; http://neuron.duke.edu/ • Neuron components in brains – Soma (cell body); dendritic tree – Axon: sends signal downstream – Synapses • Receive incoming signals from upstream neurons • Connections on dendrites, cell body, axon, synapses • Neurotransmitter mechanisms
  3. ‘a’ or ‘the’ brain? • Are we using computer models of neurons to model ‘the’ brain or model ‘a’ brain?
  4. Neuron Firing Process 1) Synapses receive incoming signals, changing the electrical (ionic) potential of the cell body 2) When the potential of the cell body reaches some limit, the neuron “fires”: an electrical signal (action potential) is sent down the axon 3) The axon propagates the signal downstream to other neurons
  5. What is represented by a biological neuron? • Cell body sums electrical potentials from incoming signals – Serves as an accumulator function over time – But “as a rule many impulses must reach a neuron almost simultaneously to make it fire” (p. 33, Brodal, 1992; italics added) • Synapses have varying effects on cell potential – Synaptic strength
  6. ANN (Artificial Neural Nets) • Approximation of biological neural nets by ANNs – No direct model of accumulator function – Synaptic strength • Approximate with connection weights (real numbers) – Spiking of output • Approximate with non-linear activation functions • Neural units – Represent activation values (numbers) – Represent inputs and outputs (numbers)
  7. Graphical Notation & Terms • Circles – Are neural units – Metaphor for nerve cell body • Arrows – Represent synaptic connections from one unit to another – These are often called weights and represented with a single value (e.g., real value) • [Diagram: one layer of neural units connected by arrows to another layer of neural units]
  8. Another Example: 8 units in each layer, fully connected network
  9. Units & Weights • Units – Sometimes notated with unit numbers • Weights – Sometimes given by symbols – Sometimes given by numbers – Always represent numbers – May be boolean valued or real valued • [Diagram: units 1–4 in one layer connected to unit 1 in the next layer; weights shown numerically in one version (0.3, -0.1, 2.1, -1.1) and symbolically in the other (W1,1, W1,2, W1,3, W1,4)]
  10. Computing with Neural Units • Inputs are presented to input units • How do we generate outputs? • One idea – Summed weighted inputs • Weights (0.3, -0.1, 2.1, -1.1) on the connections from units 1–4 • Input: (3, 1, 0, -2) • Processing: 3(0.3) + 1(-0.1) + 0(2.1) + (-2)(-1.1) = 0.9 - 0.1 + 0 + 2.2 = 3 • Output: 3
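
A minimal Python sketch of this summed-weighted-input computation (the variable names are illustrative, not from the slides):

```python
# Summed weighted input: sum of weight * input over all connections
weights = [0.3, -0.1, 2.1, -1.1]
inputs = [3, 1, 0, -2]

net = sum(w * x for w, x in zip(weights, inputs))
print(round(net, 6))  # 3.0, matching the worked example
```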
  11. Activation Functions • Usually, don’t just use the weighted sum directly • Apply some function to the weighted sum before it is used (e.g., as output) • Call this the activation function • A step function could be a good simulation of a biological neuron spiking: $f(x) = \begin{cases} 1 & \text{if } x \ge \theta \\ 0 & \text{if } x < \theta \end{cases}$ • $\theta$ is called the threshold
  12. Step Function Example • Let $\theta = 4$ • With weights (0.3, -0.1, 2.1, -1.1) and input (3, 1, 0, -2), the summed weighted input is 3, so f(3) = 0
  13. Another Activation Function: The Sigmoidal • The math of some neural nets requires that the activation function be continuously differentiable • A sigmoidal function is often used to approximate the step function: $f(x) = \frac{1}{1 + e^{-\sigma x}}$ • $\sigma$ is the steepness parameter
  14. Sigmoidal Example • With $\sigma = 1$: $f(x) = \frac{1}{1 + e^{-x}}$ • Weights (0.3, -0.1, 2.1, -1.1), input (3, 1, 0, -2); the summed weighted input is 3 • $f(3) = \frac{1}{1 + e^{-3}} \approx 0.95$
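
A small sketch of the two activation functions discussed so far, applied to the summed weighted input of 3 from the running example (the function names are illustrative; the threshold and steepness defaults are the slides' values):

```python
import math

def step(x, theta=4.0):
    # Step activation: 1 if x >= theta, else 0
    return 1 if x >= theta else 0

def sigmoid(x, sigma=1.0):
    # Sigmoidal activation: 1 / (1 + e^(-sigma * x))
    return 1.0 / (1.0 + math.exp(-sigma * x))

net = 3.0  # summed weighted input from the earlier example
print(step(net))               # 0, since 3 < theta = 4
print(round(sigmoid(net), 2))  # 0.95, matching the slide
```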
  15. Sigmoidal • [Plot: 1/(1+exp(-x)) and 1/(1+exp(-10*x)) for x from -5 to about 5, y from 0 to 1.2]
  16. Another Example • A two-weight-layer, feedforward network • Two inputs, one output, one ‘hidden’ unit • Weights into the hidden unit: 0.5 and -0.5; weight from the hidden unit to the output: 0.75 • Activation function: $f(x) = \frac{1}{1 + e^{-x}}$ • Input: (3, 1) • What is the output?
  17. Computing in Multilayer Networks • Start at leftmost layer – Compute activations based on inputs • Then work from left to right, using computed activations as inputs to the next layer • Example solution, with $f(x) = \frac{1}{1 + e^{-x}}$ – Activation of hidden unit: f(0.5(3) + -0.5(1)) = f(1.5 - 0.5) = f(1) = 0.731 – Output activation: f(0.731(0.75)) = f(0.548) = 0.634
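
A sketch of this left-to-right computation for the two-input, one-hidden-unit example (the weight names are illustrative):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

w_in_to_hidden = [0.5, -0.5]  # weights from the two inputs to the hidden unit
w_hidden_to_out = 0.75        # weight from the hidden unit to the output unit
inputs = [3, 1]

# Work left to right: hidden activation first, then the output activation
hidden = sigmoid(sum(w * x for w, x in zip(w_in_to_hidden, inputs)))
output = sigmoid(hidden * w_hidden_to_out)
print(round(hidden, 3), round(output, 3))  # 0.731 0.634, as in the example solution
```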
  18. Notation for Weighted Sums • $W_{i,j}$: weight (scalar) from unit j in the left layer to unit i in the right layer • $a_{k,l}$: activation value of unit k in layer l; layers are numbered increasing from left to right • New activation value: $a_{k,l+1} = f\left(\sum_{i=1}^{n} W_{k,i}\, a_{i,l}\right)$ • [Diagram: left-layer units a1,1–a4,1 connected by weights W1,1–W1,4 to right-layer unit a1,2]
  19. Notation • $a_{1,2} = f\left(\sum_{j=1}^{n} W_{1,j}\, a_{j,1}\right) = f(W_{1,1}a_{1,1} + W_{1,2}a_{2,1} + W_{1,3}a_{3,1} + W_{1,4}a_{4,1})$ • [Diagram: left-layer units a1,1–a4,1 connected by weights W1,1–W1,4 to right-layer unit a1,2]
  20. Notation • $W_i$: row vector of incoming weights for unit i • $a_i$: column vector of activation values of the units connected to unit i
  21. Example • $W_1 = [W_{1,1} \; W_{1,2} \; W_{1,3} \; W_{1,4}]$ • $a_1 = [a_{1,1} \; a_{2,1} \; a_{3,1} \; a_{4,1}]^T$ (a column vector) • $W_1 a_1 = [W_{1,1} \; W_{1,2} \; W_{1,3} \; W_{1,4}]\,[a_{1,1} \; a_{2,1} \; a_{3,1} \; a_{4,1}]^T$ • Recall: multiplying an n×r matrix with an r×m matrix produces an n×m matrix C, where each element $C_{i,j}$ is the scalar product of row i of the left matrix and column j of the right matrix • [Diagram: left-layer units a1,1–a4,1, weights W1,1–W1,4, right-layer unit a1,2]
  22. Scalar Result: Summed Weighted Input • $W_1 a_1 = [W_{1,1} \; W_{1,2} \; W_{1,3} \; W_{1,4}]\,[a_{1,1} \; a_{2,1} \; a_{3,1} \; a_{4,1}]^T = W_{1,1}a_{1,1} + W_{1,2}a_{2,1} + W_{1,3}a_{3,1} + W_{1,4}a_{4,1}$ • A 1×4 row vector times a 4×1 column vector gives a 1×1 matrix (a scalar)
  23. Computing New Activation Value • In the general case: $a = f(W_i a_i)$ • For the case we were considering: $a = f(W_1 a_1) = f(W_{1,1}a_{1,1} + W_{1,2}a_{2,1} + W_{1,3}a_{3,1} + W_{1,4}a_{4,1})$ • Where f(x) is the activation function, e.g., the sigmoid function
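
A numpy sketch of the row-vector-times-column-vector form f(W1 a1), using the weights and inputs from the earlier worked example and assuming the sigmoid as the activation function f:

```python
import numpy as np

def f(x):
    # Sigmoid activation, applied element-wise
    return 1.0 / (1.0 + np.exp(-x))

W1 = np.array([[0.3, -0.1, 2.1, -1.1]])  # 1x4 row vector of weights
a1 = np.array([[3], [1], [0], [-2]])     # 4x1 column vector of activations

net = W1 @ a1  # 1x1 matrix: the summed weighted input (3.0)
print(f(net))  # [[0.9526...]], i.e. f(3) is approximately 0.95
```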
  24. Example • Compute the output value • Draw the corresponding ANN • $f\left([0.4 \;\; -0.5 \;\; 1]\begin{bmatrix}1\\2\\3\end{bmatrix}\right) = \;?$
  25. ANN Solving the Equality Problem for 2 Bits • Network architecture: inputs x1, x2; hidden units y1, y2; output z1 • Goal outputs: x1=0, x2=0 → z1=1; x1=0, x2=1 → z1=0; x1=1, x2=0 → z1=0; x1=1, x2=1 → z1=1 • What weights solve this problem?
  26. Approximate Solution (http://www.d.umn.edu/~cprince/courses/cs5541fall02/lectures/neural-networks/) • Network architecture: inputs x1, x2; hidden units y1, y2; output z1 • Actual network results: x1=0, x2=0 → z1=.925; x1=0, x2=1 → z1=.192; x1=1, x2=0 → z1=.19; x1=1, x2=1 → z1=.433 • Weights: w_x1_y1 = -1.8045, w_x1_y2 = -7.7299, w_x2_y1 = -1.8116, w_x2_y2 = -7.6649, w_y1_z1 = -10.3022, w_y2_z1 = 15.3298
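
A sketch that runs these weights forward through the two-hidden-unit network to check the results table above; sigmoid activations and no bias weights are assumed here, since only the six connection weights are listed:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Weights from the slide (sigmoid units, no bias weights assumed)
w_x1_y1, w_x1_y2 = -1.8045, -7.7299
w_x2_y1, w_x2_y2 = -1.8116, -7.6649
w_y1_z1, w_y2_z1 = -10.3022, 15.3298

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    y1 = sigmoid(x1 * w_x1_y1 + x2 * w_x2_y1)  # hidden unit y1
    y2 = sigmoid(x1 * w_x1_y2 + x2 * w_x2_y2)  # hidden unit y2
    z1 = sigmoid(y1 * w_y1_z1 + y2 * w_y2_z1)  # output unit z1
    print(x1, x2, round(z1, 3))  # approximately .925, .192, .19, .433
```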
  27. How well did this approximate the goal function? • Categorically – For inputs x1=0, x2=0 and x1=1, x2=1, the output of the network was always greater than for inputs x1=1, x2=0 and x1=0, x2=1 • Summed squared error: $\sum_{s=1}^{numTrainSamples} (ActualOutput_s - DesiredOutput_s)^2$
  28. • Compute the summed squared error for our example: $\sum_{s=1}^{numTrainSamples} (ActualOutput_s - DesiredOutput_s)^2$ • Actual outputs: x1=0, x2=0 → z1=.925; x1=0, x2=1 → z1=.192; x1=1, x2=0 → z1=.19; x1=1, x2=1 → z1=.433
  29. Solution • x1=0, x2=0: expected z1=1, actual z1=0.925, squared error 0.005625 • x1=0, x2=1: expected z1=0, actual z1=0.192, squared error 0.036864 • x1=1, x2=0: expected z1=0, actual z1=0.19, squared error 0.0361 • x1=1, x2=1: expected z1=1, actual z1=0.433, squared error 0.321489 • Sum squared error = 0.400078
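
A quick check of the summed squared error for the table above (the list names are illustrative):

```python
# Summed squared error over the four training samples
expected = [1, 0, 0, 1]
actual = [0.925, 0.192, 0.19, 0.433]

sse = sum((a - e) ** 2 for a, e in zip(actual, expected))
print(round(sse, 6))  # 0.400078, matching the slide
```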
  30. Weight Matrix • A row vector provides weights for a single unit in the “right” layer • A weight matrix provides all weights connecting the “left” layer to the “right” layer • Let W be an n×r weight matrix – Row vector i in the matrix connects unit i in the “right” layer to the units in the “left” layer – n units in the layer to the “right” – r units in the layer to the “left”
  31. Notation • $a_i$: the vector of activation values of the layer to the “left”; an r×1 column vector (same as before) • $W a_i$: an n×1 column vector; the summed weighted inputs for the “right” layer • $f(W a_i)$: n×1 vector of new activation values for the “right” layer • The function f is now taken as applying element-wise to a matrix
  32. Example • Updating hidden layer activation values: f(W a), where W is a 5×2 weight matrix and a is the 2×1 column vector of input activations • Updating output layer activation values: f(W′ a′), where W′ is a 3×5 weight matrix and a′ is the 5×1 column vector of hidden layer activations • Draw the architecture (units and arcs representing weights) of the connectionist model
  33. Answer • 2 input units • 5 hidden layer units • 3 output units • Fully connected, feedforward network
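
A numpy sketch of these matrix-based layer updates for the 2-input, 5-hidden, 3-output network; the weight and input values below are placeholders (the slide's numeric entries are not reproduced), only the matrix shapes follow the example:

```python
import numpy as np

def f(x):
    # Sigmoid activation, applied element-wise
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
W_hidden = rng.standard_normal((5, 2))  # weights: input layer -> hidden layer
W_output = rng.standard_normal((3, 5))  # weights: hidden layer -> output layer

a_input = np.array([[0.5], [1.0]])     # 2x1 column vector of input activations (illustrative)
a_hidden = f(W_hidden @ a_input)       # 5x1 hidden layer activations
a_output = f(W_output @ a_hidden)      # 3x1 output layer activations
print(a_hidden.shape, a_output.shape)  # (5, 1) (3, 1)
```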
  34. Bias Weights • Used to provide a trainable threshold • The bias b is treated as another weight, but it connects to a unit with constant activation value 1 • [Diagram: a unit with incoming weights W1,1–W1,4 plus a bias weight b from a unit whose activation is fixed at 1]
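
A sketch of a unit with a bias weight, reusing the earlier example values; the bias value itself is illustrative:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# The bias is just one more weight, attached to an extra input whose
# activation is always 1; it plays the role of a trainable threshold
weights = [0.3, -0.1, 2.1, -1.1]
bias = -4.0
inputs = [3, 1, 0, -2]

net = sum(w * x for w, x in zip(weights, inputs)) + bias * 1.0
print(round(sigmoid(net), 3))  # 0.269: activation with the threshold shifted by the bias
```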
