Nn3
neural networks
Published in: Education

Transcript

  • 1. Neural Networks. Prof. Ruchi Sharma, Bharati Vidyapeeth College of Engineering, New Delhi. Lecture 1.
  • 2. Plan: Requirements; Background (Brains and Computers; Computational Models; Learnability vs. Programming; Representability vs. Training Rules); Abstractions of Neurons; Abstractions of Networks; Completeness of 3-Level McCulloch-Pitts Neurons; Learnability in Perceptrons.
  • 3. Brains and Computers. What is computation? The basic computational model (like C, Pascal, C++, etc.; see the course on Models of Computation): analysis of the problem and reduction to an algorithm, then implementation of the algorithm by a programmer.
  • 4. Computation. The brain also computes, but its programming seems different: it is flexible and fault tolerant (neurons die every day), it programs itself automatically, it learns from examples, and it generalizes.
  • 5. Brain vs. Computer. The brain works on slow components (10**-3 sec); computers use fast components (10**-9 sec). The brain is more efficient (a few joules per operation, a factor of about 10**10), and it uses a massively parallel mechanism. Can we copy its secrets?
  • 6. Brain vs. Computer. Areas where the brain is better: sense recognition and integration, working with incomplete information, generalizing, learning from examples, fault tolerance (regular programming is notoriously fragile).
  • 7. AI vs. NNs. AI relates to cognitive psychology: chess, theorem proving, expert systems, intelligent agents (e.g. on the Internet). NNs relate to neurophysiology: recognition tasks, associative memory, learning.
  • 8. How can NNs work? Look at the brain: 10**10 neurons (10 gigabytes), 10**20 possible connections with different numbers of dendrites (reals); actually about 6 x 10**13 connections (i.e. 60,000 hard discs for one photo of the contents of one brain!).
  • 9. Brain: complex; non-linear (more later!); parallel processing; fault tolerant; adaptive; learns; generalizes; self-programs.
  • 10. Abstracting. Note: size may not be crucial (an Aplysia or a crab does many things). Look at simple structures first.
  • 11. Real and Artificial Neurons.
  • 12. One Neuron: the McCulloch-Pitts neuron. The real thing is very complicated, but abstracting the details we have inputs x1 ... xn, weights w1 ... wn, a summation Σ, and a threshold: the integrate-and-fire neuron. (Diagram of the unit; a code sketch follows below.)
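The abstraction on this slide fits in a few lines of code. Below is a minimal sketch in Python (ours, not part of the deck; the name mp_neuron is an assumption) of the integrate-and-fire unit: form the weighted sum, then compare it against the threshold.

    # Minimal sketch of the abstract McCulloch-Pitts unit (not from the slides):
    # fire (output 1) when the weighted sum of the inputs exceeds the threshold.
    def mp_neuron(inputs, weights, threshold):
        total = sum(w * x for w, x in zip(weights, inputs))   # integrate
        return 1 if total > threshold else 0                  # threshold / fire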
  • 13. Representability. What functions can be represented by a network of McCulloch-Pitts neurons? Theorem: every logic function of an arbitrary number of variables can be represented by a three-level network of such neurons.
  • 14. Proof. Show the simple functions AND, OR, NOT, implies; then recall the representability of logic functions in DNF form.
  • 15. AND, OR, NOT. (Diagram: a threshold unit with inputs x1, x2, weights 1.0 and 1.0, and threshold 1.5 realizes AND.)
  • 16. AND, OR, NOT. (Diagram: the same unit with weights 1.0 and 1.0 and threshold 0.9 realizes OR.)
  • 17. (Figure-only slide.)
  • 18. AND, OR, NOT. (Diagram: a single-input unit with weight -1.0 and threshold -0.5 realizes NOT. These three units are rendered in code below.)
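Reading the weights and thresholds off slides 15-18, the unit above realizes the three gates. A hedged illustration (it reuses the mp_neuron sketch introduced after slide 12, which is our own code):

    AND = lambda x1, x2: mp_neuron([x1, x2], [1.0, 1.0], 1.5)   # weights 1.0, 1.0; threshold 1.5
    OR  = lambda x1, x2: mp_neuron([x1, x2], [1.0, 1.0], 0.9)   # weights 1.0, 1.0; threshold 0.9
    NOT = lambda x:      mp_neuron([x], [-1.0], -0.5)           # weight -1.0; threshold -0.5

    for a in (0, 1):
        for b in (0, 1):
            print(a, b, AND(a, b), OR(a, b))    # prints the AND and OR truth tables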
  • 19. DNF and All Functions. Theorem: any logic (boolean) function of any number of variables can be represented by a network of McCulloch-Pitts neurons; in fact the depth of the network is three. Proof: use the DNF form together with the AND, OR, NOT units above. (A concrete rendering follows below.)
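To make the three-level DNF construction concrete, here is a sketch of our own (not from the deck) that builds XOR from its DNF (x1 AND NOT x2) OR (NOT x1 AND x2): negated literals feed a layer of AND units, one per DNF term, whose outputs feed a single OR unit.

    # Three-level network for XOR via its DNF, using the AND/OR/NOT units above.
    def xor(x1, x2):
        n1, n2 = NOT(x1), NOT(x2)   # negated literals
        t1 = AND(x1, n2)            # DNF term: x1 AND NOT x2
        t2 = AND(n1, x2)            # DNF term: NOT x1 AND x2
        return OR(t1, t2)           # OR over the terms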
  • 20. Other Questions? What if we allow real numbers as inputs/outputs: what real functions can be represented? What if we modify the threshold to some other function, so the output is not {0,1}: what functions can be represented then?
  • 21. Representability and Generalizability.
  • 22. Learnability and Generalizability. The previous theorem tells us that neural networks are potentially powerful, but it doesn't tell us how to use them. We desire simple networks with uniform training rules.
  • 23. One Neuron (Perceptron). What can be represented by one neuron? Is there an automatic way to learn a function from examples?
  • 24. Perceptron Training Rule. Loop: take an example and apply it to the network; if the answer is correct, return to Loop; if incorrect, go to FIX. FIX: adjust the network weights by the input example, then go to Loop.
  • 25. Example of Perceptron Learning. x1 = 1 (+), x2 = -.5 (-), x3 = 3 (+), x4 = -2 (-). Expanded vectors: y1 = (1, 1) (+), y2 = (-.5, 1) (-), y3 = (3, 1) (+), y4 = (-2, 1) (-). Random initial weight: (-2.5, 1.75).
  • 26. Graph of Learning. (Figure.)
  • 27. Trace of Perceptron. w1·y1 = (-2.5, 1.75)·(1, 1) < 0, wrong: w2 = w1 + y1 = (-1.5, 2.75). w2·y2 = (-1.5, 2.75)·(-.5, 1) > 0, wrong: w3 = w2 - y2 = (-1, 1.75). w3·y3 = (-1, 1.75)·(3, 1) < 0, wrong: w4 = w3 + y3 = (2, 2.75). (A short script reproducing this trace follows below.)
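The trace can be checked mechanically. A short script of our own (it encodes positive examples with label +1 and negative examples with label -1, which is bookkeeping we introduce, not the slides' notation):

    ys = [((1.0, 1.0), +1), ((-0.5, 1.0), -1), ((3.0, 1.0), +1), ((-2.0, 1.0), -1)]
    w = (-2.5, 1.75)                           # random initial weight from slide 25
    for (y, s) in ys:
        val = w[0] * y[0] + w[1] * y[1]        # w . y
        if s * val <= 0:                       # wrong (or boundary): apply FIX
            w = (w[0] + s * y[0], w[1] + s * y[1])
        print(y, val, w)                       # reproduces w2, w3, w4 of the trace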
  • 28. Perceptron Convergence Theorem. If the concept is representable in a perceptron, then the perceptron learning rule will converge in a finite amount of time. (Make precise and prove.)
  • 29. What is a Neural Network? What is an abstract neuron? What is a neural network? How are they computed? What are the advantages? Where can they be used? Agenda: what to expect.
  • 30. Perceptron Algorithm. Start: choose an arbitrary value for the weights W. Test: choose an arbitrary example X; if X is positive and WX > 0, or X is negative and WX <= 0, go to Test. Fix: if X is positive, W := W + X; if X is negative, W := W - X; go to Test. (A runnable sketch follows below.)
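The Start/Test/Fix procedure translates directly into code. A minimal sketch of our own (the function names perceptron and dot, the label convention, and the max_fixes safety budget are assumptions, not part of the deck):

    import random

    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    def perceptron(positives, negatives, w, max_fixes=10000):
        # Test: stop once every example is classified correctly;
        # Fix: otherwise adjust w by a misclassified example.
        examples = [(x, +1) for x in positives] + [(x, -1) for x in negatives]
        for _ in range(max_fixes):
            wrong = [(x, s) for (x, s) in examples
                     if (s > 0 and dot(w, x) <= 0) or (s < 0 and dot(w, x) > 0)]
            if not wrong:
                return w
            x, s = random.choice(wrong)
            w = tuple(wi + s * xi for wi, xi in zip(w, x))   # W := W + X or W := W - X
        return w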
  • 31. Perceptron Convergence Theorem. Let F be a set of unit-length vectors. If there is a vector V* and a value e > 0 such that V*·X > e for all X in F, then the perceptron program goes to FIX only a finite number of times.
  • 32. Applying the Algorithm to "And". W0 = (0,0,1) or random. X1 = (0,0,1), result 0; X2 = (0,1,1), result 0; X3 = (1,0,1), result 0; X4 = (1,1,1), result 1.
  • 33. "And" continued. W0·X1 = 1 > 0, wrong: W1 = W0 - X1 = (0,0,0). W1·X2 = 0, OK (boundary). W1·X3 = 0, OK. W1·X4 = 0, wrong: W2 = W1 + X4 = (1,1,1). W3·X1 = 1, wrong: W4 = W3 - X1 = (1,1,0). W4·X2 = 1, wrong: W5 = W4 - X2 = (1,0,-1). W5·X3 = 0, OK. W5·X4 = 0, wrong: W6 = W5 + X4 = (2,1,0). W6·X1 = 0, OK. W6·X2 = 1, wrong: W7 = W6 - X2 = (2,0,-1).
  • 34. "And", page 3. W8·X3 = 1, wrong: W9 = W8 - X3 = (1,0,0). W9·X4 = 1, OK. W9·X1 = 0, OK. W9·X2 = 0, OK. W9·X3 = 1, wrong: W10 = W9 - X3 = (0,0,-1). W10·X4 = -1, wrong: W11 = W10 + X4 = (1,1,0). W11·X1 = 0, OK. W11·X2 = 1, wrong: W12 = W11 - X2 = (1,0,-1).
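Run on the AND data of slide 32 (bias appended as a third component), the perceptron sketch given after slide 30 converges to a separating weight vector; the exact value depends on the order in which examples are picked, so it need not match the trace step for step.

    positives = [(1, 1, 1)]                       # AND is 1 only for input (1, 1)
    negatives = [(0, 0, 1), (0, 1, 1), (1, 0, 1)]
    w = perceptron(positives, negatives, w=(0, 0, 1))
    print(w)                                      # one separating weight vector for AND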
  • 35. Proof of the Convergence Theorem. Notes: (1) By hypothesis, there is a δ > 0 such that V*·X > δ for all X in F. (2) We can eliminate the threshold by adding an additional dimension to the input: W·(x,y,z) > threshold if and only if the extended weight vector satisfies W'·(x,y,z,1) > 0. (3) We can assume all examples are positive (replace each negative example by its negated vector): W·(x,y,z) < 0 if and only if W·(-x,-y,-z) > 0.
  • 36. Proof (cont.). Consider the quotient V*·W/|W| (note: this is the multidimensional cosine between V* and W). Recall that V* is a unit vector, so the quotient is <= 1.
  • 37. Proof (cont.). Each time FIX is visited, W changes via an ADD: V*·W(n+1) = V*·(W(n) + X) = V*·W(n) + V*·X >= V*·W(n) + δ. Hence V*·W(n) >= nδ.  (*)
  • 38. Proof (cont.). Now consider the denominator: |W(n+1)|² = (W(n) + X)·(W(n) + X) = |W(n)|² + 2W(n)·X + 1 <= |W(n)|² + 1 (recall |X| = 1, and W(n)·X <= 0 since X is a positive example and we are in FIX). So after n visits to FIX, |W(n)|² <= n.  (**)
  • 39. Proof (cont.). Putting (*) and (**) together: quotient = V*·W(n)/|W(n)| >= nδ/sqrt(n) = sqrt(n)·δ. Since the quotient is <= 1, this means n <= (1/δ)², so we enter FIX only a bounded number of times. Q.E.D.
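The whole argument fits in one chain of inequalities. A compact restatement in LaTeX (ours, following slides 36-39):

    \begin{aligned}
    \text{numerator:}\quad & V^{*}\cdot W(n+1) = V^{*}\cdot W(n) + V^{*}\cdot X \ge V^{*}\cdot W(n) + \delta
      \;\Rightarrow\; V^{*}\cdot W(n) \ge n\delta,\\
    \text{denominator:}\quad & |W(n+1)|^{2} = |W(n)|^{2} + 2\,W(n)\cdot X + |X|^{2} \le |W(n)|^{2} + 1
      \;\Rightarrow\; |W(n)|^{2} \le n,\\
    \text{together:}\quad & 1 \ge \frac{V^{*}\cdot W(n)}{|W(n)|} \ge \frac{n\delta}{\sqrt{n}} = \sqrt{n}\,\delta
      \;\Rightarrow\; n \le \frac{1}{\delta^{2}}.
    \end{aligned}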
  • 40. Geometric Proof. See hand slides.
  • 41. Perceptron Diagram 1. (Figure: the region of solution vectors, marked "Solution OK", bordered by regions containing no examples.)
  • 42. (Figure: the same diagram.)
  • 43. (Figure-only slide.)
  • 44. Additional Facts. Note: if the X's are presented in a systematic way, then a solution W is always found. Note: it is not necessarily the same as V*. Note: if F is not finite, we may not obtain a solution in finite time. The algorithm can be modified in minor ways and remains valid (e.g. bounded rather than unit examples; changes in W(n)).
  • 45. Perceptron Convergence Theorem. If the concept is representable in a perceptron, then the perceptron learning rule will converge in a finite amount of time. (Make precise and prove.)
  • 46. Important Points. The theorem only guarantees a result IF the concept is representable! Usually we are not interested in just representing something but in its generalizability: how will it work on examples we haven't seen?
  • 47. Percentage of Boolean Functions Representable by a Perceptron.
    Inputs   Representable by a perceptron   Total functions
    1        4                               4
    2        14                              16
    3        104                             256
    4        1,882                           65,536
    5        94,572                          ~10**9
    6        15,028,134                      ~10**19
    7        8,378,070,864                   ~10**38
    8        17,561,539,552,946              ~10**77
  • 48. Generalizability. Typically we train a network on a sample set of examples and then use it on the general class. Training can be slow, but execution is fast.
  • 49. Perceptron. Weights: Σ wi·xi > θ ⇔ the letter A is in the receptive field. Pattern identification. (Note: the neuron is trained.)
  • 50. What won't work? Example: connectedness with a bounded-diameter perceptron. Compare with convexity (which can use sensors of order three).
  • 51. Biases and Thresholds. We can replace the threshold with a bias: Σ_{i=1..n} wi·xi > θ ⇔ Σ_{i=1..n} wi·xi - θ > 0 ⇔ Σ_{i=0..n} wi·xi > 0, where w0 = -θ and x0 = 1. A bias acts exactly as a weight on a connection from a unit whose activation is always 1.
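Set in LaTeX for readability (the same identity the slide states):

    \sum_{i=1}^{n} w_i x_i > \theta
    \;\Longleftrightarrow\;
    \sum_{i=1}^{n} w_i x_i - \theta > 0
    \;\Longleftrightarrow\;
    \sum_{i=0}^{n} w_i x_i > 0,
    \qquad w_0 = -\theta,\; x_0 = 1.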
  • 52. Perceptron. Loop: take an example and apply it to the network; if the answer is correct, return to Loop; if incorrect, go to Fix. Fix: adjust the network weights by the input example; go to Loop.
  • 53. Perceptron Algorithm. Let v be arbitrary. Choose: choose x ∈ F. Test: if x ∈ F+ and v·x > 0, go to Choose; if x ∈ F+ and v·x ≤ 0, go to Fix-plus; if x ∈ F- and v·x ≤ 0, go to Choose; if x ∈ F- and v·x > 0, go to Fix-minus. Fix-plus: v := v + x, go to Choose. Fix-minus: v := v - x, go to Choose.
  • 54. Perceptron Algorithm. Conditions for the algorithm: Condition 1: there exists v* such that v*·x > 0 if x ∈ F+ and v*·x ≤ 0 if x ∈ F-. Condition 2: we choose F to be a set of unit vectors.
  • 55. Geometric viewpoint. (Figure: the vectors w_{n+1}, x, v*, and w_n.)
  • 56. Perceptron Algorithm. Based on these conditions, the number of times we enter the Loop is finite. Proof: the example world is F = F+ ∪ F-, with positive examples {x : A*·x > 0} and negative examples {y : A*·y ≤ 0}.
  • 57. Perceptron Algorithm, Proof. We replace the threshold with a bias, and we assume F is a set of unit vectors: ∀φ ∈ F, |φ| = 1.
  • 58. Perceptron Algorithm, Proof. We reduce what we have to prove by eliminating the negative examples and placing their negations among the positive ones: A*·y ≤ 0 ⇒ A*·(-y) ≥ 0, so each yi is replaced by -yi and F = F+. Then ∀φ ∈ F, A*·φ > 0, and hence there exists δ > 0 with A*·φ > δ for all φ ∈ F.
  • 59. Perceptron Algorithm, Proof. Consider G(A) = (A*·A)/(|A*|·|A|) = (A*·A)/|A| ≤ 1. The numerator: A*·A_{l+1} = A*·(A_l + φ) = A*·A_l + A*·φ ≥ A*·A_l + δ, since A*·φ > δ. After n changes, A*·A_n ≥ n·δ.
  • 60. Perceptron Algorithm, Proof. The denominator: |A_{t+1}|² = (A_t + φ)·(A_t + φ) = |A_t|² + 2·A_t·φ + |φ|² < |A_t|² + 1, since A_t·φ < 0 at a fix and |φ| = 1. After n changes, |A_n|² < n.
  • 61. Perceptron Algorithm, Proof. From the denominator, |A_n|² < n; from the numerator, A*·A_n > n·δ. So G(A) = (A*·A_n)/|A_n| > (n·δ)/sqrt(n) = sqrt(n)·δ. Since G(A) ≤ 1, we get n < 1/δ², so n is finite.
  • 62. Example - AND. Truth table (x1, x2, bias → AND): x0 = (0,0,1) → 0; x1 = (0,1,1) → 0; x2 = (1,0,1) → 0; x3 = (1,1,1) → 1. Trace: w0 = (0,0,0); w0·x0 = 0, w0·x1 = 0, w0·x2 = 0, w0·x3 = 0, wrong (a positive example needs w·x > 0), FIX: w1 = w0 + x3 = (1,1,1). w1·x0 = 1, wrong, FIX: w2 = w1 - x0 = (1,1,0). w2·x1 = 1, wrong, FIX: w3 = w2 - x1 = (1,0,-1) ... etc.
  • 63. AND, bipolar solution. (Table: a training trace on the bipolar (-1/+1) encoding of AND; several steps are marked wrong and the run continues until it succeeds.)
  • 64. Problem. Suppose LHS1 + MHS1 + RHS1 > 0, LHS2 + MHS2 + RHS2 < 0, and LHS3 + MHS3 + RHS3 > 0, with MHS1 = MHS2 = MHS3, LHS1 = LHS2, and RHS2 = RHS3. Then RHS2 should be small enough that LHS2 + RHS2 < 0, and LHS2 should also be small enough that LHS2 + RHS2 < 0: a contradiction!
  • 65. Linear Separation. Every perceptron determines a classification of vector inputs, given by a hyperplane. Two-dimensional examples (add algebra): OR and AND are separable; XOR is not possible. (A small check in code follows below.)
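A quick empirical check of the XOR claim, reusing the perceptron sketch given after slide 30 (the fix budget of 1000 is an arbitrary assumption of ours): whatever weight vector the loop ends with, some XOR example is still misclassified, whereas the same call converges for AND and OR.

    xor_pos = [(1, 0, 1), (0, 1, 1)]              # XOR positives, bias appended
    xor_neg = [(0, 0, 1), (1, 1, 1)]              # XOR negatives
    w = perceptron(xor_pos, xor_neg, w=(0, 0, 0), max_fixes=1000)
    still_wrong = [x for x in xor_pos if dot(w, x) <= 0] + \
                  [x for x in xor_neg if dot(w, x) > 0]
    print(still_wrong)                            # never empty: XOR is not linearly separable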
  • 66. Linear Separation in Higher Dimensions. In higher dimensions there is still linear separation, but it is hard to tell. Example: connected vs. convex; which can be handled by a perceptron with local sensors, and which cannot be? Note: define local sensors.
  • 67. What won't work? Try XOR.
  • 68. Limitations of the Perceptron. Representability: only concepts that are linearly separable. Compare: convex versus connected. Examples: XOR vs. OR.