
Artificial Neural Network


An artificial neural network (ANN) is a computational model that mimics the way nerve cells work in the human brain. ANNs use learning algorithms that can independently make adjustments, or learn in a sense, as they receive new input.


  1. Artificial Neural Networks. Lecture 1: Introduction, biological motivations, perceptron networks, the perceptron learning rule
  2. Main approaches in modern Artificial Intelligence:
     • traditional knowledge-based systems (e.g. expert systems)
     • neural networks
     • fuzzy systems
     • evolutionary algorithms
     • statistical and Bayesian ML
     • multi-agent systems
     • distributed artificial intelligence
     Neural networks, fuzzy systems and evolutionary algorithms are often grouped as Computational Intelligence (or Soft Computing); combining approaches gives Hybrid Intelligent Systems.
  3. Lecture outline:
     • Biological foundations of neural networks: a short introduction to neurons, synapses, and the concept of learning
     • Perceptron networks
     • Linear separability
     • Perceptron learning rule
     • Types of neural networks; history of artificial neural networks
  4. What is connectionism / neural networks? A simplified model of how natural neural systems work. Neural Networks (NNs) simulate natural information-processing tasks of the human brain. A NN model consists of neurons and connections between neurons (synapses). Characteristics of the human brain:
     ◦ contains about 10^11 neurons and 10^15 connections
     ◦ each neuron may connect to 10,000 other neurons
     ◦ a human can perform a picture-naming task in about 500 milliseconds
  5. Biological neural networks. [Figure: two connected neurons showing soma, dendrites, axon, and synapses; picture from M. Negnevitsky, Pearson Education, 2002.] Dendrites bring information to the cell body and axons take information away from the cell body. The soma, or cell body, is the bulbous, non-process portion of a neuron or other brain cell type, containing the cell nucleus.
  6. What is an Artificial Neural Network? A mapping f: R^n → R^m from n inputs to m outputs, where each output is computed as y = f( Σ_{i=1..n} w_i x_i + b ).
  7. Artificial Neural Networks (ANNs): an Artificial Neural Network is an interconnected assembly of simple processing elements, units or nodes (neurons), whose functionality is inspired by the functioning of natural neurons in the brain. The processing ability of the network lies in the inter-unit connection strengths, or weights, obtained by a process of learning from a set of training patterns.
  8. Artificial Neural Nets (ANNs): the units (individual neurons) operate only locally on the inputs they receive via connections. ANNs undergo some sort of "training" whereby the connection weights are adjusted on the basis of presented data. In other words, ANNs "learn" from examples (as children learn to recognize dogs from examples of dogs) and exhibit some capability to generalize beyond the training data.
  9. Formal neuron model (McCulloch and Walter Pitts, 1943). A single neuron has 6 components: 1. input x; 2. weights w; 3. bias b (threshold = -b); 4. activation function f; 5. input function σ (the weighted sum); 6. output y, computed as y = f( Σ_i x_i w_i + b ). McCulloch & Pitts (1943) are recognised as the designers of the first neuron (and neural network) model.
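The six components above can be sketched as a tiny forward pass in Python (a minimal sketch; the names `neuron` and `step` are illustrative, not from the slides):

```python
def neuron(x, w, b, f):
    """Formal neuron: y = f(sum_i x_i * w_i + b)."""
    s = sum(xi * wi for xi, wi in zip(x, w)) + b  # input function: weighted sum plus bias
    return f(s)                                   # activation function

step = lambda s: 1 if s >= 0 else 0  # step activation with threshold 0 (threshold = -bias)

y = neuron([1, 0], [0.4, 0.6], -0.5, step)  # 0.4 - 0.5 = -0.1 < 0, so y = 0
```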
  10. Formal neuron - activation functions:
     • Step_t(x) = 1 if x >= t, else 0
     • Sign(x) = +1 if x >= 0, else -1
     • Sigmoid(x) = 1 / (1 + e^(-k*x))
     • Identity function: Id(x) = x
     • Linear function: Lin(x) = k*x
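These activation functions translate directly into Python (a sketch; the gain k is a free parameter, and the default values chosen here are illustrative):

```python
import math

def step_t(x, t):       # Step_t(x) = 1 if x >= t, else 0
    return 1 if x >= t else 0

def sign(x):            # Sign(x) = +1 if x >= 0, else -1
    return 1 if x >= 0 else -1

def sigmoid(x, k=1.0):  # Sigmoid(x) = 1 / (1 + e^(-k*x))
    return 1.0 / (1.0 + math.exp(-k * x))

def identity(x):        # Id(x) = x
    return x

def linear(x, k=2.0):   # Lin(x) = k*x
    return k * x
```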
  11. A neuron which implements the OR function: weights 1.5 and 1.5, threshold = 1, bias = -1 (threshold = -bias).
     X1 X2 | Y
      1  1 | 1
      1  0 | 1
      0  1 | 1
      0  0 | 0
     Using McCulloch-Pitts neurons it is possible to implement the basic logic gates (e.g. AND, OR, NOT), and it is well known that any logical function can be constructed from these three basic gates. The neuron only accepts binary inputs; it can be viewed as classifying the binary inputs into the binary outputs Y using a linear decision boundary in the (X1, X2) plane.
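As a check, the OR neuron above (weights 1.5 and 1.5, bias -1) can be simulated; the name `mp_neuron` is illustrative:

```python
def mp_neuron(x1, x2, w1=1.5, w2=1.5, bias=-1):
    """McCulloch-Pitts style unit: fires when w1*x1 + w2*x2 + bias >= 0."""
    return 1 if w1 * x1 + w2 * x2 + bias >= 0 else 0

# Enumerate all binary inputs: reproduces the OR truth table (only (0,0) maps to 0).
truth = {(x1, x2): mp_neuron(x1, x2) for x1 in (0, 1) for x2 in (0, 1)}
```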
  12. Perceptron model (Frank Rosenblatt, 1957):
     • The perceptron is a single-layer, feed-forward network.
     • The inputs are real values.
     • The input connections are associated with weights.
     • The weighted sum of the inputs and a threshold can be used to calculate the output for decision making.
     • Only a linear decision surface is learned. (from G. Kendall, Lect. Notes, Univ. of Nottingham)
  13. What can perceptrons learn? [Figure: the points (0,0), (0,1), (1,0), (1,1) plotted for AND and for XOR; a single line separates the AND classes but not the XOR classes.]
     • Functions that can be separated in this way are called linearly separable (XOR is not linearly separable).
     • A perceptron can learn (represent) only linearly separable functions (e.g. OR, AND, NOT).
  14. The XOR problem - not linearly separable. One neuron layer is not enough; we must introduce an intermediate (hidden) layer. Y = X1 XOR X2 = (X2 AND NOT X1) OR (X1 AND NOT X2). Threshold for all nodes = 1.5; connection weights 2, -1, -1, 2, 2, 2.
     X1 X2 | Y
      1  1 | 0
      1  0 | 1
      0  1 | 1
      0  0 | 0
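The two-layer XOR solution above can be verified in code. Reading the slide's weights as hidden units (-1, 2) and (2, -1) with output weights (2, 2), all with threshold 1.5, is an interpretation consistent with the AND-NOT decomposition, not an exact transcription of the slide's diagram:

```python
def step(s):
    return 1 if s >= 0 else 0

def xor_net(x1, x2):
    # Hidden layer: threshold 1.5 for all nodes, i.e. bias -1.5
    h1 = step(-1 * x1 + 2 * x2 - 1.5)   # computes X2 AND NOT X1
    h2 = step( 2 * x1 - 1 * x2 - 1.5)   # computes X1 AND NOT X2
    # Output node: OR of the two hidden units
    return step(2 * h1 + 2 * h2 - 1.5)
```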
  15. What can perceptrons represent? Take a 3-input neuron: the decision boundary becomes a separating plane. Linear separability is also possible in more than 3 dimensions, but it is harder to visualize. (from G. Kendall, Lect. Notes, Univ. of Nottingham)
  16. Training a perceptron. Training dataset: { (x(i), d(i)), i = 1,…,p }, where p is the number of training examples. For the AND function, p = 4 and Training set = { ((1,1),1), ((1,0),0), ((0,1),0), ((0,0),0) }. The training technique is called the Perceptron Learning Rule.
     AND: X1 X2 | D
          1  1  | 1
          1  0  | 0
          0  1  | 0
          0  0  | 0
  17. Perceptron learning. [Figure: the input points in the (X1, X2) plane, after weight initialization (first epoch) and at convergence.] The separation line is w1*X1 + w2*X2 + b = 0; points with w1*X1 + w2*X2 + b > 0 lie on one side, points with w1*X1 + w2*X2 + b < 0 on the other.
  18. Training a perceptron (main idea): vectors from the training set are presented to the perceptron network one after another (cyclically or randomly): (x(1), d(1)), (x(2), d(2)), …, (x(p), d(p)), (x(p+1), d(p+1)), …
     • If the network's output is correct, no change is made; otherwise the weights and biases are updated using the Perceptron Learning Rule.
     • An entire pass through all of the input training vectors is called an epoch.
     • When an entire pass through the training set has occurred without error, training is complete.
  19. The Perceptron learning algorithm: 1. Initialize the weights and threshold to small random numbers. 2. At time step t, present an input vector to the neuron and calculate the perceptron output y(t). 3. Update the weights and bias as follows: wj(t+1) = wj(t) + η (d(t) - y(t)) xj(t); b(t+1) = b(t) + η (d(t) - y(t)), where d(t) is the desired output, y(t) is the computed output, t is the step/iteration number, and η is the gain or step size (learning rate), with 0.0 < η <= 1.0. 4. Repeat steps 2 and 3 until the iteration error is less than a user-specified error threshold or a predetermined number of iterations has been completed. (The perceptron learning algorithm was developed originally by F. Rosenblatt in the late 1950s.)
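Steps 1-4 can be sketched as a training loop (illustrative code, assuming a step activation and binary targets as on the previous slides):

```python
import random

def train_perceptron(data, eta=0.1, max_epochs=1000):
    """data: list of (input tuple, desired output) pairs; returns weights and bias."""
    n = len(data[0][0])
    w = [random.uniform(-0.5, 0.5) for _ in range(n)]  # step 1: small random init
    b = random.uniform(-0.5, 0.5)
    for _ in range(max_epochs):                        # step 4: repeat until done
        errors = 0
        for x, d in data:                              # step 2: present a vector, compute y
            y = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else 0
            if y != d:                                 # step 3: update only on error
                w = [wi + eta * (d - y) * xi for wi, xi in zip(w, x)]
                b += eta * (d - y)
                errors += 1
        if errors == 0:                                # error-free epoch: converged
            break
    return w, b
```

For example, `train_perceptron([((1,1),1), ((1,0),0), ((0,1),0), ((0,0),0)])` learns the AND function, since AND is linearly separable.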
  20.-27. Training a perceptron - example: the AND function. Initial weights: 0.4, 0.6; threshold = 0.5 (bias = -0.5); learning rate η = 0.1. The training examples are presented one after another; the last three columns show the weights and bias after each presentation:
     Input 1 Input 2 | Target Output | Weight 1 Weight 2 Bias
        1      1     |   1      1    |   0.4      0.6    -0.5
        1      0     |   0      0    |   0.4      0.6    -0.5
        0      1     |   0      1    |   0.4      0.5    -0.6   (error: update applied)
        0      0     |   0      0    |   0.4      0.5    -0.6
        1      1     |   1      1    |   0.4      0.5    -0.6
        1      0     |   0      0    |   0.4      0.5    -0.6
        0      1     |   0      0    |   0.4      0.5    -0.6   (a full error-free epoch: training complete)
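The AND training example can be reproduced in code (a sketch; the function name is illustrative). Starting from weights 0.4, 0.6 and bias -0.5 with η = 0.1, only the presentation of (0, 1) triggers an update:

```python
def train_and_example():
    data = [((1, 1), 1), ((1, 0), 0), ((0, 1), 0), ((0, 0), 0)]
    w, b, eta = [0.4, 0.6], -0.5, 0.1
    while True:
        errors = 0
        for (x1, x2), d in data:
            y = 1 if w[0] * x1 + w[1] * x2 + b >= 0 else 0
            if y != d:                   # perceptron learning rule on error
                w = [w[0] + eta * (d - y) * x1, w[1] + eta * (d - y) * x2]
                b += eta * (d - y)
                errors += 1
        if errors == 0:                  # epoch with no errors: training complete
            return w, b

w, b = train_and_example()               # converges near w = [0.4, 0.5], b = -0.6
```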
  28. Training a perceptron:
     • Learning only occurs when an error is made; otherwise the weights are left unchanged.
     • During training it is useful to measure the performance of the network as it attempts to find the optimal weight set.
     • A common error measure is the sum-squared error, computed over all input/output vector pairs in the training set: E = (1/2) Σ_{i=1..p} (d_i(t) - y_i(t))^2, where p is the number of input/output vector pairs in the training set.
     • η, the learning rate, dictates how quickly the network converges; it is set by experimentation (usually small, e.g. 0.1).
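The error measure above is straightforward to compute (a sketch; the function name is illustrative):

```python
def sum_squared_error(targets, outputs):
    """E = 1/2 * sum over the p training pairs of (d_i - y_i)^2."""
    return 0.5 * sum((d - y) ** 2 for d, y in zip(targets, outputs))

e = sum_squared_error([1, 0, 0, 0], [1, 1, 0, 0])  # one wrong pair, so E = 0.5
```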
  29. Major classes of neural networks:
     • Backpropagation neural networks (supervised learning)
     • Kohonen Self-Organizing Maps (unsupervised learning)
     • Hopfield neural networks (recurrent neural networks)
     • Radial Basis Function neural networks (RBF)
     • Neuro-Fuzzy networks (NF)
     • Others: various architectures of recurrent neural networks, networks with dynamic neurons, networks with competitive learning, etc.
     • More recent improvements: convolutional neural networks, spiking neural networks, deep learning architectures
  30. History of neural networks:
     McCulloch & Pitts (1943) ------ neural networks and artificial intelligence were born; first well-known model of a biological neuron
     Hebb (1949) ------ Hebb learning rule
     Minsky (1954) ------ Neural Networks (PhD thesis)
     Rosenblatt (1957) ------ Perceptron networks (perceptron learning rule)
     Widrow and Hoff (1959) ------ Delta rule for ADALINE networks
     Minsky & Papert (1969) ------ criticism of perceptron networks (the problem of linear separability)
     Kohonen (1982) ------ Self-Organizing Maps
     Hopfield (1982) ------ Hopfield networks
     Rumelhart, Hinton & Williams (1986) ------ Back-propagation algorithm
     Broomhead & Lowe (1988) ------ Radial Basis Function networks (RBF)
     Vapnik (1990) ------ Support Vector Machine approach
     In the ’90s there was massive interest in neural networks and many NN applications were developed; later, neuro-fuzzy networks, spiking NNs, deep NN architectures, etc. emerged.
  31. Neural networks:
     • can learn directly from data (good learning ability, better than other AI approaches)
     • can learn from noisy or corrupted data
     • parallel information processing
     • computationally fast once trained
     • robust to partial failure of the network
     • useful where data are available but symbolic knowledge is difficult to acquire
     Drawbacks of NNs:
     • the knowledge captured by a NN through learning (in the weights, as real numbers) is not in a form familiar to humans, e.g. if-then rules; NNs are black-box structures and the resulting model is difficult to interpret
     • long training time
  32. Thank you
