An Introduction to Neural Networks

                              Vincent Cheung
                              Kevin Cannons
                      Signal & Data Compression Laboratory
                        Electrical & Computer Engineering
                             University of Manitoba
                          Winnipeg, Manitoba, Canada
                            Advisor: Dr. W. Kinsner

May 27, 2002
Neural Networks


         ● Fundamentals
         ● Classes
         ● Design and Verification
         ● Results and Discussion
         ● Conclusion

Cheung/Cannons                                    1
Fundamentals                                                                Neural Networks

               What Are Artificial Neural Networks?

               ● An extremely simplified model of the brain

               ● Essentially a function approximator
                  ►   Transforms inputs into outputs to the best of its ability

                  [Figure: Inputs → NN → Outputs]


               What Are Artificial Neural Networks?

               ● Composed of many “neurons” that co-operate
                 to perform the desired function


               What Are They Used For?

               ● Classification

                  ►   Pattern recognition, feature extraction, image

               ● Noise Reduction
                  ►   Recognize patterns in the inputs and produce

                      noiseless outputs

               ● Prediction
                  ►   Extrapolation based on historical data


               Why Use Neural Networks?

               ● Ability to learn
                  ►   NNs figure out how to perform their function on their own
                  ►   Determine their function based only upon sample inputs

               ● Ability to generalize
                  ►   i.e., produce reasonable outputs for inputs they have not
                      been taught how to deal with


               How Do Neural Networks Work?

               ● The output of a neuron is a function of the
                 weighted sum of the inputs plus a bias

                  [Figure: a neuron with inputs i1, i2, i3 and weights w1, w2, w3]

                  $\text{Output} = f(i_1 w_1 + i_2 w_2 + i_3 w_3 + \text{bias})$



               ● The function of the entire neural network is simply
                 the computation of the outputs of all the neurons
                  ►   An entirely deterministic calculation
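
The single-neuron calculation described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names, the example weights, and the choice of the logistic function as f are assumptions made for the example.

```python
import math

def logistic(x):
    """Logistic sigmoid, used here as an example activation function f."""
    return 1.0 / (1.0 + math.exp(-x))

def neuron_output(inputs, weights, bias, activation=logistic):
    """Output = f(weighted sum of the inputs + bias)."""
    net = sum(i * w for i, w in zip(inputs, weights)) + bias
    return activation(net)

# Hypothetical values chosen only for illustration.
out = neuron_output([1.0, 0.0, 1.0], [0.5, -0.3, 0.2], bias=-0.1)
print(out)
```

Calling the function twice with the same arguments gives the same result, which is the sense in which the calculation is deterministic.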


               Activation Functions

               ● Applied to the weighted sum of the inputs of a
                 neuron to produce the output
               ● Majority of NNs use sigmoid functions
                  ►   Smooth, continuous, and monotonically increasing
                      (derivative is always positive)

                  ►   Bounded range - but never reaches max or min
                       ■ Consider “ON” to be slightly less than the max and “OFF” to
                         be slightly greater than the min


               Activation Functions

               ● The most common sigmoid function used is the
                 logistic function
                  ►   $f(x) = \frac{1}{1 + e^{-x}}$
                  ►   The calculation of derivatives is important for neural
                      networks, and the logistic function has a very nice
                      derivative
                       ■ $f'(x) = f(x)\,(1 - f(x))$

               ● Other sigmoid functions are also used
                  ►   hyperbolic tangent
                  ►   arctangent

               ● The exact nature of the function has little effect on
                 the abilities of the neural network
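
As a quick check of the derivative identity for the logistic function, the sketch below compares f(x)(1 - f(x)) with a central finite difference. The helper names, step size, and sample points are assumptions made for this illustration.

```python
import math

def logistic(x):
    """f(x) = 1 / (1 + e^(-x))"""
    return 1.0 / (1.0 + math.exp(-x))

def logistic_derivative(x):
    """f'(x) = f(x) * (1 - f(x)), computed from the function value itself."""
    fx = logistic(x)
    return fx * (1.0 - fx)

def numerical_derivative(f, x, h=1e-6):
    """Central finite-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

# Largest disagreement between the two derivatives over a few sample points.
sample_points = [-4.0, -1.0, 0.0, 0.5, 3.0]
max_gap = max(
    abs(logistic_derivative(x) - numerical_derivative(logistic, x))
    for x in sample_points
)
print(max_gap)
```

The closed form is convenient in training because the derivative reuses the neuron output that the forward calculation has already produced.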


               Where Do The Weights Come From?

               ● The weights in a neural network are the most
                 important factor in determining its function
               ● Training is the act of presenting the network with
                 some sample data and modifying the weights to
                 better approximate the desired function

               ● There are two main types of training
                  ►   Supervised Training
                       ■ Supplies the neural network with inputs and the desired
                         outputs
                       ■ Response of the network to the inputs is measured
                             The weights are modified to reduce the difference between
                             the actual and desired outputs


               Where Do The Weights Come From?

                 ►   Unsupervised Training

                      ■ Only supplies inputs
                      ■ The neural network adjusts its own weights so that similar
                        inputs cause similar outputs
                            The network identifies the patterns and differences in the
                            inputs without any external assistance

               ● Epoch
                      ■ One iteration through the process of providing the network
                        with an input and updating the network's weights
                      ■ Typically many epochs are required to train the neural
                        network


               Perceptrons

               ● First neural network with the ability to learn

               ● Made up of only input neurons and output neurons
               ● Input neurons typically have two states: ON and OFF
               ● Output neurons use a simple threshold activation function

               ● In basic form, can only solve linear problems
                   ►   Limited applications



                  [Figure: input neurons connected through weights to an output neuron]


               How Do Perceptrons Learn?

               ● Uses supervised training

               ● If the output is not correct, the weights are
                 adjusted according to the formula:
                         ■ $w_{new} = w_{old} + \alpha\,(\text{desired} - \text{output}) \cdot \text{input}$,
                           where α is the learning rate

               ● Example: inputs 1, 0, 1 with weights 0.5, 0.2, 0.8
                  ►   Weighted sum: 1 * 0.5 + 0 * 0.2 + 1 * 0.8 = 1.3
                  ►   Assuming an output threshold of 1.2: 1.3 > 1.2, so the output is 1
                  ►   Assume the output was supposed to be 0 and α = 1;
                      update the weights:
                       ■ w1new = 0.5 + 1*(0-1)*1 = -0.5
                       ■ w2new = 0.2 + 1*(0-1)*0 = 0.2
                       ■ w3new = 0.8 + 1*(0-1)*1 = -0.2
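
The worked example on this slide can be reproduced with a short sketch. The function names are hypothetical; the inputs, weights, threshold, and learning rate are the values from the slide.

```python
def perceptron_output(inputs, weights, threshold):
    """Threshold activation: 1 if the weighted sum exceeds the threshold, else 0."""
    net = sum(i * w for i, w in zip(inputs, weights))
    return 1 if net > threshold else 0

def perceptron_update(inputs, weights, desired, output, alpha):
    """w_new = w_old + alpha * (desired - output) * input, applied to each weight."""
    return [w + alpha * (desired - output) * i for i, w in zip(inputs, weights)]

inputs = [1, 0, 1]
weights = [0.5, 0.2, 0.8]

output = perceptron_output(inputs, weights, threshold=1.2)
new_weights = perceptron_update(inputs, weights, desired=0, output=output, alpha=1)
print(output, new_weights)
```

Only the weights attached to active inputs change, so the second weight stays at 0.2. Printed values may show small floating-point rounding.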


               Multilayer Feedforward Networks

               ● Most common neural network

               ● An extension of the perceptron
                  ►   Multiple layers
                       ■ The addition of one or more “hidden” layers in between the
                         input and output layers

                  ►   Activation function is not simply a threshold
                       ■ Usually a sigmoid function
                  ►   A general function approximator
                       ■ Not limited to linear problems

               ● Information flows in one direction
                  ►   The outputs of one layer act as inputs to the next layer


               XOR Example




               Inputs: 0, 1

               H1: $\text{Net} = 0(4.83) + 1(-4.83) - 2.82 = -7.65$
                   $\text{Output} = 1 / (1 + e^{7.65}) = 4.758 \times 10^{-4}$

               H2: $\text{Net} = 0(-4.63) + 1(4.6) - 2.74 = 1.86$
                   $\text{Output} = 1 / (1 + e^{-1.86}) = 0.8652$

               O:  $\text{Net} = 4.758 \times 10^{-4}(5.73) + 0.8652(5.83) - 2.86 = 2.187$
                   $\text{Output} = 1 / (1 + e^{-2.187}) = 0.8991 \equiv$ “1”
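
The same forward calculation can be run for all four input pairs. This is a sketch under one assumption: the network diagram did not survive, so the mapping of each weight to the first or second input is inferred from the order of the terms in the calculation above. The function names and the 0.5 decision level are also choices made for this example.

```python
import math

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def xor_network(i1, i2):
    """Forward pass through two hidden neurons (H1, H2) and one output neuron (O)."""
    h1 = logistic(i1 * 4.83 + i2 * -4.83 - 2.82)
    h2 = logistic(i1 * -4.63 + i2 * 4.6 - 2.74)
    return logistic(h1 * 5.73 + h2 * 5.83 - 2.86)

# Evaluate every input combination and read outputs above 0.5 as "1".
outputs = {(a, b): xor_network(a, b) for a in (0, 1) for b in (0, 1)}
classes = {pair: 1 if value > 0.5 else 0 for pair, value in outputs.items()}

for pair in sorted(outputs):
    print(pair, round(outputs[pair], 4), classes[pair])
```

With these weights the outputs for unequal inputs sit near 0.9 and the outputs for equal inputs sit near 0.1, matching the “ON”/“OFF” convention for sigmoid outputs.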


               Backpropagation

               ● Most common method of obtaining the many
                 weights in the network
               ● A form of supervised training
               ● The basic backpropagation algorithm is based on
                 minimizing the error of the network using the
                 derivatives of the error function
                  ►   Simple
                  ►   Slow

                  ►   Prone to local minima issues



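               ● Most common measure of error is the mean
                 square error:
                       $E = (\text{target} - \text{output})^2$

               ● Partial derivatives of the error with respect to the weights
                 (a constant factor of 2 is dropped here; it can be absorbed
                 into the learning rate):

                  ►   Output neurons (j = output neuron, i = neuron in last
                      hidden layer):
                       $\delta_j = f'(\text{net}_j)\,(\text{target}_j - \text{output}_j)$
                       $\partial E / \partial w_{ji} = -\text{output}_i\,\delta_j$

                  ►   Hidden neurons (j = hidden neuron, i = neuron in previous
                      layer, k = neuron in next layer):
                       $\delta_j = f'(\text{net}_j) \sum_k \delta_k w_{kj}$
                       $\partial E / \partial w_{ji} = -\text{output}_i\,\delta_j$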



               ● Calculation of the derivatives flows backwards
                 through the network, hence the name,
                 backpropagation
               ● These derivatives point in the direction of the
                 maximum increase of the error function

               ● A small step (learning rate) in the opposite
                 direction will result in the maximum decrease of
                 the (local) error function:

                      $w_{new} = w_{old} - \alpha\,\partial E / \partial w_{old}$
                      where α is the learning rate
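
A minimal sketch of one backpropagation step for a network with 2 inputs, 2 hidden neurons, and 1 output is given below. It is an illustration under stated assumptions rather than the authors' implementation: the error is taken as E = 1/2 (target - output)^2 so that the delta formulas hold exactly, each bias is treated as a weight fed by a constant input of 1, and the starting weights, sample, and learning rate are hypothetical.

```python
import math

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, w_hidden, w_out):
    """Return inputs-with-bias, hidden-outputs-with-bias, and the network output."""
    xb = list(x) + [1.0]                     # constant 1 feeds the bias weight
    hidden = [logistic(sum(w * v for w, v in zip(row, xb))) for row in w_hidden]
    hb = hidden + [1.0]
    output = logistic(sum(w * v for w, v in zip(w_out, hb)))
    return xb, hb, output

def error(x, target, w_hidden, w_out):
    # Assumed convention: E = 1/2 (target - output)^2
    return 0.5 * (target - forward(x, w_hidden, w_out)[2]) ** 2

def gradients(x, target, w_hidden, w_out):
    """dE/dw for every weight, using the output and hidden delta formulas."""
    xb, hb, output = forward(x, w_hidden, w_out)
    # Output neuron: delta = f'(net) (target - output), with f'(net) = out (1 - out)
    delta_out = output * (1.0 - output) * (target - output)
    grad_out = [-h * delta_out for h in hb]
    grad_hidden = []
    for j, row in enumerate(w_hidden):
        # Hidden neuron j: delta = f'(net_j) * sum over next layer of delta_k * w_kj
        delta_j = hb[j] * (1.0 - hb[j]) * delta_out * w_out[j]
        grad_hidden.append([-v * delta_j for v in xb])
    return grad_hidden, grad_out

def train_step(x, target, w_hidden, w_out, alpha):
    """w_new = w_old - alpha * dE/dw for every weight."""
    grad_hidden, grad_out = gradients(x, target, w_hidden, w_out)
    new_hidden = [[w - alpha * g for w, g in zip(row, g_row)]
                  for row, g_row in zip(w_hidden, grad_hidden)]
    new_out = [w - alpha * g for w, g in zip(w_out, grad_out)]
    return new_hidden, new_out

# Hypothetical sample and starting weights (last entry of each row is the bias).
x, target = [0.0, 1.0], 0.9
w_hidden = [[0.1, -0.2, 0.05], [0.3, 0.4, -0.1]]
w_out = [0.2, -0.3, 0.1]

error_before = error(x, target, w_hidden, w_out)
w_hidden2, w_out2 = train_step(x, target, w_hidden, w_out, alpha=0.5)
error_after = error(x, target, w_hidden2, w_out2)
print(error_before, error_after)
```

The output delta is computed first and then reused for the hidden deltas, which is the backward flow the slide refers to.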



               ● The learning rate is important

                  ►   Too small
                       ■ Convergence extremely slow
                  ►   Too large
                       ■ May not converge

               ● Momentum
                  ►   Tends to aid convergence
                  ►   Applies smoothed averaging to the change in weights:

                       $\Delta_{new} = \beta \Delta_{old} - \alpha\,\partial E / \partial w_{old}$
                       where β is the momentum coefficient

                       $w_{new} = w_{old} + \Delta_{new}$
                  ►   Acts as a low-pass filter by reducing rapid fluctuations
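
The momentum update can be sketched on a one-weight toy problem. Everything specific here is an assumption for illustration: the error function E(w) = w^2, the starting weight, the tolerance, and the values of α and β. The outcome on this toy function says nothing general about other problems.

```python
def steps_to_converge(alpha, beta, w=1.0, tol=1e-3, max_steps=10_000):
    """Count weight updates until both the weight and its last change are below tol."""
    delta = 0.0
    for step in range(1, max_steps + 1):
        grad = 2.0 * w                        # dE/dw for the hypothetical E(w) = w**2
        delta = beta * delta - alpha * grad   # delta_new = beta*delta_old - alpha*dE/dw
        w = w + delta                         # w_new = w_old + delta_new
        if abs(w) < tol and abs(delta) < tol:
            return step
    return max_steps

plain_steps = steps_to_converge(alpha=0.01, beta=0.0)      # no momentum
momentum_steps = steps_to_converge(alpha=0.01, beta=0.9)   # with momentum
print(plain_steps, momentum_steps)
```

With a small learning rate, the momentum term accumulates successive steps in the same direction, so this toy case needs fewer updates with momentum than without.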


               Local Minima

               ● Training is essentially minimizing the mean square
                 error function
                  ►   Key problem is avoiding local minima
                  ►   Traditional techniques for avoiding local minima:
                       ■ Simulated annealing
                             Perturb the weights in progressively smaller amounts
                       ■ Genetic algorithms
                             Use the weights as chromosomes
                             Apply natural selection, mating, and mutations to these
                             chromosomes


               Counterpropagation (CP) Networks

               ● Another multilayer feedforward network

               ● Up to 100 times faster than backpropagation
               ● Not as general as backpropagation
               ● Made up of three layers:

                  ►   Input
                  ►   Kohonen
                  ►   Grossberg (Output)

                  [Figure: Inputs → Input Layer → Kohonen Layer → Grossberg Layer → Outputs]


               How Do They Work?

               ● Kohonen Layer:

                  ►   Neurons in the Kohonen layer sum all of the weighted
                      inputs received
                  ►   The neuron with the largest sum outputs a 1 and the
                      other neurons output 0

               ● Grossberg Layer:
                  ►   Each Grossberg neuron merely outputs the weight of the
                      connection between itself and the one active Kohonen
                      neuron
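
The two layers described on this slide can be sketched as a winner-take-all step followed by a weight lookup. The function names, the weight layout, and the example weights are assumptions made for this illustration.

```python
def kohonen_layer(inputs, kohonen_weights):
    """The neuron with the largest weighted sum outputs 1; all others output 0."""
    sums = [sum(i * w for i, w in zip(inputs, row)) for row in kohonen_weights]
    winner = max(range(len(sums)), key=lambda k: sums[k])
    return [1 if k == winner else 0 for k in range(len(sums))]

def grossberg_layer(kohonen_outputs, grossberg_weights):
    """Each Grossberg neuron outputs its weight from the one active Kohonen neuron.

    grossberg_weights[g][k] is the weight from Kohonen neuron k to Grossberg neuron g.
    """
    winner = kohonen_outputs.index(1)
    return [row[winner] for row in grossberg_weights]

# Hypothetical weights: 2 inputs, 2 Kohonen neurons, 2 Grossberg neurons.
kohonen_weights = [[1.0, 0.0], [0.0, 1.0]]
grossberg_weights = [[0.2, 0.7], [0.9, 0.1]]

active = kohonen_layer([0.1, 0.8], kohonen_weights)
outputs = grossberg_layer(active, grossberg_weights)
print(active, outputs)
```

Because only one Kohonen neuron is active, the forward pass amounts to selecting one stored output vector per class.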


               Why Two Different Types of Layers?

               ● More accurate representation of biological neural
                 networks
               ● Each layer has its own distinct purpose:
                  ►   Kohonen layer separates inputs into separate classes
                       ■ Inputs in the same class will turn on the same Kohonen
                         neuron
                  ►   Grossberg layer adjusts weights to obtain acceptable
                      outputs for each class


               Training a CP Network

               ● Training the Kohonen layer

                  ►   Uses unsupervised training
                  ►   Input vectors are often normalized
                  ►   The one active Kohonen neuron updates its weights
                      according to the formula:

                       $w_{new} = w_{old} + \alpha\,(\text{input} - w_{old})$
                       where α is the learning rate

                       ■ The weights of the connections are being modified to more
                         closely match the values of the inputs
                       ■ At the end of training, the weights will approximate the
                         average value of the inputs in that class
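
The Kohonen update rule can be sketched for a single active neuron. The function name, the two example input vectors, and the learning rate are assumptions for this illustration, and input normalization is left out to keep the example short.

```python
def kohonen_update(weights, inputs, alpha):
    """w_new = w_old + alpha * (input - w_old), applied to each weight."""
    return [w + alpha * (i - w) for w, i in zip(weights, inputs)]

# A single update moves the weights part of the way toward the input.
first = kohonen_update([0.0, 0.0], [1.0, 2.0], alpha=0.5)

# Repeatedly presenting two hypothetical inputs from the same class
# pulls the weights toward the average of those inputs, here (1.0, 1.0).
weights = [0.0, 0.0]
class_inputs = [[0.9, 1.1], [1.1, 0.9]]
for step in range(200):
    weights = kohonen_update(weights, class_inputs[step % 2], alpha=0.1)
print(first, weights)
```

With a constant learning rate the weights keep hovering close to the class average rather than settling on it exactly, which is consistent with the slide's wording that the weights approximate the average.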


               Training a CP Network

               ● Training the Grossberg layer

                  ►   Uses supervised training
                  ►   Weight update algorithm is similar to that used in


               Hidden Layers and Neurons

               ● For most problems, one layer is sufficient

               ● Two layers are required when the function is
               ● The number of neurons is very important:

                  ►   Too few
                       ■ Underfit the data – NN can’t learn the details
                  ►   Too many
                       ■ Overfit the data – NN learns the insignificant details

                  ►   Start small and increase the number until satisfactory
                      results are obtained



                  [Figure panel: Well fit]


               How is the Training Set Chosen?

               ● Overfitting can also occur if a “good” training set is
                 not chosen
               ● What constitutes a “good” training set?
                  ►   Samples must represent the general population
                  ►   Samples must contain members of each class
                  ►   Samples in each class must contain a wide range of
                      variations or noise effects


               Size of the Training Set

               ● The size of the training set is related to the
                 number of hidden neurons
                  ►   E.g. 10 inputs, 5 hidden neurons, 2 outputs:
                  ►   11(5) + 6(2) = 67 weights (variables)
                  ►   If only 10 training samples are used to determine these
                      weights, the network will end up being overfit
                       ■ Any solution found will be specific to the 10 training
                         samples
                       ■ Analogous to having 10 equations, 67 unknowns: you
                         can come up with a specific solution, but you can’t find the
                         general solution with the given information
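
The weight count in this example follows from giving every neuron one weight per input plus one bias weight. A small sketch, with a hypothetical helper name:

```python
def count_weights(layer_sizes):
    """Number of weights in a fully connected feedforward network.

    Each neuron has one weight per neuron in the previous layer plus one bias,
    so a layer contributes (n_in + 1) * n_out weights.
    """
    return sum(
        (n_in + 1) * n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )

# 10 inputs, 5 hidden neurons, 2 outputs: 11(5) + 6(2)
total = count_weights([10, 5, 2])
print(total)
```

Comparing this count with the number of available training samples gives a rough first check on whether a proposed network size is reasonable.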


               Training and Verification

               ● The set of all known samples is broken into two
                 orthogonal (independent) sets:
                  ►   Training set
                       ■ A group of samples used to train the neural network
                  ►   Testing set
                       ■ A group of samples used to test the performance of the
                         neural network
                       ■ Used to estimate the error rate

                  [Figure: known samples divided into a training set and a testing set]
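
One simple way to form the two sets is a shuffled split, sketched below. The function name, the 25% test fraction, and the fixed seed are assumptions chosen for this illustration; the slides do not say how the split was made.

```python
import random

def split_samples(samples, test_fraction=0.25, seed=0):
    """Shuffle the known samples and divide them into disjoint training and testing sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)    # seeded so the split is repeatable
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# 32 known samples, identified here simply by index.
training_set, testing_set = split_samples(range(32))
print(len(training_set), len(testing_set))
```

No sample appears in both sets, so performance on the testing set is measured on data the network was not trained on.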



               ● Provides an unbiased test of the quality of the
                 network
               ● Common error is to “test” the neural network using
                 the same samples that were used to train the
                 neural network
                  ►   The network was optimized on these samples, and will
                      obviously perform well on them
                  ►   Doesn’t give any indication as to how well the network
                      will be able to classify inputs that weren’t in the training
                      set




               ● Various metrics can be used to grade the
                 performance of the neural network based upon the
                 results of the testing set
                  ►   Mean square error, SNR, etc.

               ● Resampling is an alternative method of estimating
                 the error rate of the neural network
                  ►   Basic idea is to iterate the training and testing
                      procedures multiple times

                  ►   Two main techniques are used:
                       ■ Cross-Validation
                       ■ Bootstrapping
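
The idea of iterating the training and testing procedures can be sketched with k-fold cross-validation, one common form of cross-validation. The function names are hypothetical, and `train_and_test` stands in for whatever procedure trains a network on the training indices and returns its error on the testing indices.

```python
def k_fold_splits(n_samples, k):
    """Yield (training_indices, testing_indices) pairs, one per fold."""
    indices = list(range(n_samples))
    folds = [indices[i::k] for i in range(k)]   # k roughly equal groups
    for held_out in range(k):
        testing = folds[held_out]
        training = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield training, testing

def cross_validated_error(n_samples, k, train_and_test):
    """Average the error returned by train_and_test over all k folds."""
    errors = [train_and_test(tr, te) for tr, te in k_fold_splits(n_samples, k)]
    return sum(errors) / len(errors)

# Each sample is used for testing exactly once across the folds.
splits = list(k_fold_splits(10, 5))
for training, testing in splits:
    print(testing, training)
```

Each fold trains on most of the samples and tests on the rest, so every known sample contributes to the error estimate without being tested by a network that was trained on it.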


               Results and Discussion

               ● A simple toy problem was used to test the
                 operation of a perceptron

               ● Provided the perceptron with 5 pieces of
                 information about a face – the individual’s hair,
                 eye, nose, mouth, and ear type

                  ►   Each piece of information could take a value of +1 or -1
                       ■ +1 indicates a “girl” feature
                       ■ -1 indicates a “guy” feature

               ● The individual was to be classified as a girl if the
                 face had more “girl” features than “guy” features
                 and a boy otherwise


               Results and Discussion

               ● Constructed a perceptron with 5 inputs and 1
                 output

                  [Figure: face feature inputs → input neurons → output neuron → output value indicating boy or girl]

               ● Trained the perceptron with 24 out of the 32
                 possible inputs over 1000 epochs
               ● The perceptron was able to classify the faces that
                 were not in the training set
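
A sketch of this experiment is given below. It is not the authors' code: the bias term, the zero starting weights, the learning rate of 1, the random choice of the 24 training faces, and the reading of an epoch as one pass through the training faces are all assumptions made for this illustration.

```python
import itertools
import random

def classify(face, weights, bias):
    """Threshold output: 1 = girl, 0 = boy."""
    net = sum(f * w for f, w in zip(face, weights)) + bias
    return 1 if net > 0 else 0

def train(samples, alpha=1.0, epochs=1000):
    """Perceptron rule: w_new = w_old + alpha * (desired - output) * input."""
    weights, bias = [0.0] * 5, 0.0
    for _ in range(epochs):                  # assumed: one epoch = one pass over the samples
        for face, desired in samples:
            step = alpha * (desired - classify(face, weights, bias))
            weights = [w + step * f for w, f in zip(weights, face)]
            bias += step                     # bias acts as a weight on a constant input of 1
    return weights, bias

def accuracy(samples, weights, bias):
    correct = sum(classify(face, weights, bias) == label for face, label in samples)
    return correct / len(samples)

# All 32 faces: 5 features, each +1 ("girl" feature) or -1 ("guy" feature).
faces = list(itertools.product([-1, 1], repeat=5))
labelled = [(face, 1 if sum(face) > 0 else 0) for face in faces]

random.Random(0).shuffle(labelled)           # seeded so the split is repeatable
training, held_out = labelled[:24], labelled[24:]

weights, bias = train(training)
training_accuracy = accuracy(training, weights, bias)
held_out_accuracy = accuracy(held_out, weights, bias)
print(training_accuracy, held_out_accuracy)
```

The majority rule is linearly separable, so the perceptron rule settles on weights that classify every training face correctly. How the held-out faces are classified depends on which 24 faces were used for training.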

               Results and Discussion

               ● A number of toy problems were tested on
                 multilayer feedforward NNs with a single hidden
                 layer and backpropagation:
                  ►   Inverter
                       ■ The NN was trained to simply output 0.1 when given a “1”
                         and 0.9 when given a “0”

                                 A demonstration of the NN’s ability to memorize
                       ■ 1 input, 1 hidden neuron, 1 output
                       ■ With learning rate of 0.5 and no momentum, it took about
                         3,500 epochs for sufficient training
                       ■ Including a momentum coefficient of 0.9 reduced the

                         number of epochs required to about 250


               Results and Discussion

                 ►   Inverter (continued)

                      ■ Increasing the learning rate decreased the training time
                        without hampering convergence for this simple example
                      ■ Increasing the epoch size, the number of samples per
                        epoch, decreased the number of epochs required and
                        seemed to aid in convergence (reduced fluctuations)

                      ■ Increasing the number of hidden neurons decreased the
                        number of epochs required
                            Allowed the NN to better memorize the training set – the goal
                            of this toy problem
                            Not recommended to use in “real” problems, since the NN

                            loses its ability to generalize


               Results and Discussion

                 ►   AND gate

                      ■ 2 inputs, 2 hidden neurons, 1 output
                      ■ About 2,500 epochs were required when using momentum
                 ►   XOR gate
                      ■ Same as AND gate

                 ►   3-to-8 decoder
                      ■ 3 inputs, 3 hidden neurons, 8 outputs
                      ■ About 5,000 epochs were required when using momentum


               Results and Discussion

                 ►   Absolute sine function approximator (|sin(x)|)

                      ■ A demonstration of the NN’s ability to learn the desired
                        function, |sin(x)|, and to generalize
                      ■ 1 input, 5 hidden neurons, 1 output
                      ■ The NN was trained with samples between –π/2 and π/2
                            The inputs were rounded to one decimal place

                            The desired targets were scaled to between 0.1 and 0.9
                      ■ The test data contained samples in between the training
                        samples (i.e. more than 1 decimal place)
                            The outputs were translated back to between 0 and 1

                      ■ About 50,000 epochs required with momentum
                      ■ Function is not smooth at 0 (continuous, but with a discontinuous derivative)


               Results and Discussion

                 ►   Gaussian function approximator ($e^{-x^2}$)

                      ■ 1 input, 2 hidden neurons, 1 output
                      ■ Similar to the absolute sine function approximator, except
                        that the domain was changed to between -3 and 3
                      ■ About 10,000 epochs were required with momentum
                      ■ Smooth function


               Results and Discussion

                 ►   Primality tester

                      ■ 7 inputs, 8 hidden neurons, 1 output
                      ■ The input to the NN was a binary number
                      ■ The NN was trained to output 0.9 if the number was prime
                        and 0.1 if the number was composite
                             Classification and memorization test

                      ■ The inputs were restricted to between 0 and 100
                      ■ About 50,000 epochs required for the NN to memorize the
                        classifications for the training set
                             No attempts at generalization were made due to the

                             complexity of the pattern of prime numbers
                      ■ Some issues with local minima


               Results and Discussion

                 ►   Prime number generator
                      ■ Provide the network with a seed, and a prime number of the

                        same order should be returned
                      ■ 7 inputs, 4 hidden neurons, 7 outputs
                      ■ Both the input and outputs were binary numbers
                      ■ The network was trained as an autoassociative network

                            Prime numbers from 0 to 100 were presented to the network
                            and it was requested that the network echo the prime
                            The intent was to have the network output the closest prime
                            number when given a composite number
                      ■ After one million epochs, the network was successfully able

                        to produce prime numbers for about 85 - 90% of the
                        numbers between 0 and 100
                      ■ Using Gray code instead of binary did not improve results
                      ■ Perhaps needs a second hidden layer, or implement some
                        heuristics to reduce local minima issues


         Conclusion

         ● The toy examples confirmed the basic operation of
           neural networks and also demonstrated their
           ability to learn the desired function and generalize
           when needed
         ● The ability of neural networks to learn and
           generalize in addition to their wide range of
           applicability makes them very powerful tools


                 Questions and Comments


         Acknowledgements

         ● Natural Sciences and Engineering Research
           Council (NSERC)

         ● University of Manitoba

Neural Networks


         [AbDo99] H. Abdi, D. Valentin, B. Edelman, Neural Networks, Thousand Oaks, CA: SAGE Publication
             Inc., 1999.

         [Hayk94] S. Haykin, Neural Networks, New York, NY: Nacmillan College Publishing Company, Inc., 1994.

         [Mast93] T. Masters, Practial Neural Network Recipes in C++, Toronto, ON: Academic Press, Inc., 1993.

         [Scha97] R. Schalkoff, Artificial Neural Networks, Toronto, ON: the McGraw-Hill Companies, Inc., 1997.

         [WeKu91] S. M. Weiss and C. A. Kulikowski, Computer Systems That Learn, San Mateo, CA: Morgan
             Kaufmann Publishers, Inc., 1991.

         [Wass89] P. D. Wasserman, Neural Computing: Theory and Practice, New York, NY: Van Nostrand
             Reinhold, 1989.

Neural networks.cheungcannonnotes

  • 1. An Introduction to Neural Networks Vincent Cheung Kevin Cannons Signal & Data Compression Laboratory Electrical & Computer Engineering University of Manitoba Winnipeg, Manitoba, Canada Advisor: Dr. W. Kinsner May 27, 2002
  • 2. Neural Networks Outline ● Fundamentals ● Classes ● Design and Verification ● Results and Discussion ● Conclusion Cheung/Cannons 1
  • 3. Fundamentals Neural Networks What Are Artificial Neural Networks? ● An extremely simplified model of the brain Classes ● Essentially a function approximator ► Transforms inputs into outputs to the best of its ability Design Inputs Outputs Results Inputs NN Outputs Cheung/Cannons 2
  • 4. Fundamentals Neural Networks What Are Artificial Neural Networks? ● Composed of many “neurons” that co-operate Classes to perform the desired function Design Results Cheung/Cannons 3
  • 5. Fundamentals Neural Networks What Are They Used For? ● Classification Classes ► Pattern recognition, feature extraction, image matching ● Noise Reduction ► Recognize patterns in the inputs and produce Design noiseless outputs ● Prediction ► Extrapolation based on historical data Results Cheung/Cannons 4
  • 6. Fundamentals Neural Networks Why Use Neural Networks? ● Ability to learn Classes ► NN’s figure out how to perform their function on their own ► Determine their function based only upon sample inputs ● Ability to generalize i.e. produce reasonable outputs for inputs it has not been Design ► taught how to deal with Results Cheung/Cannons 5
  • 7. Fundamentals Neural Networks How Do Neural Networks Work? ● The output of a neuron is a function of the Classes weighted sum of the inputs plus a bias i1 w1 w2 i2 w3 Neuron Output = f(i1w1 + i2w2 + i3w3 + bias) Design i3 bias Results ● The function of the entire neural network is simply the computation of the outputs of all the neurons ► An entirely deterministic calculation Cheung/Cannons 6
  • 8. Fundamentals Neural Networks Activation Functions ● Applied to the weighted sum of the inputs of a Classes neuron to produce the output ● Majority of NN’s use sigmoid functions ► Smooth, continuous, and monotonically increasing (derivative is always positive) Design ► Bounded range - but never reaches max or min ■ Consider “ON” to be slightly less than the max and “OFF” to be slightly greater than the min Results Cheung/Cannons 7
  • 9. Fundamentals Neural Networks Activation Functions ● The most common sigmoid function used is the Classes logistic function ► f(x) = 1/(1 + e-x) ► The calculation of derivatives are important for neural networks and the logistic function has a very nice Design derivative ■ f’(x) = f(x)(1 - f(x)) ● Other sigmoid functions also used hyperbolic tangent Results ► ► arctangent ● The exact nature of the function has little effect on the abilities of the neural network Cheung/Cannons 8
  • 10. Fundamentals Neural Networks Where Do The Weights Come From? ● The weights in a neural network are the most Classes important factor in determining its function ● Training is the act of presenting the network with some sample data and modifying the weights to better approximate the desired function Design ● There are two main types of training ► Supervised Training ■ Supplies the neural network with inputs and the desired Results outputs ■ Response of the network to the inputs is measured The weights are modified to reduce the difference between the actual and desired outputs Cheung/Cannons 9
  • 11. Fundamentals Neural Networks Where Do The Weights Come From? ► Unsupervised Training Classes ■ Only supplies inputs ■ The neural network adjusts its own weights so that similar inputs cause similar outputs The network identifies the patterns and differences in the inputs without any external assistance Design ● Epoch ■ One iteration through the process of providing the network with an input and updating the network's weights ■ Typically many epochs are required to train the neural Results network Cheung/Cannons 10
  • 12. Fundamentals Neural Networks Perceptrons ● First neural network with the ability to learn Classes ● Made up of only input neurons and output neurons ● Input neurons typically have two states: ON and OFF ● Output neurons use a simple threshold activation function Design ● In basic form, can only solve linear problems ► Limited applications .5 Results .2 .8 Input Neurons Weights Output Neuron Cheung/Cannons 11
  • 13. Fundamentals Neural Networks How Do Perceptrons Learn? ● Uses supervised training Classes ● If the output is not correct, the weights are adjusted according to the formula: ■ wnew = wold + α(desired – output)*input α is the learning rate Design 1 0.5 0.2 0 1 1 0.8 Results Assume Output was supposed to be 0 update the weights 1 * 0.5 + 0 * 0.2 + 1 * 0.8 = 1.3 Assuming Output Threshold = 1.2 Assume α = 1 1.3 > 1.2 W 1new = 0.5 + 1*(0-1)*1 = -0.5 W 2new = 0.2 + 1*(0-1)*0 = 0.2 W 3new = 0.8 + 1*(0-1)*1 = -0.2 Cheung/Cannons 12
  • 14. Fundamentals Neural Networks Multilayer Feedforward Networks ● Most common neural network Classes ● An extension of the perceptron ► Multiple layers ■ The addition of one or more “hidden” layers in between the input and output layers Design ► Activation function is not simply a threshold ■ Usually a sigmoid function ► A general function approximator ■ Not limited to linear problems Results ● Information flows in one direction ► The outputs of one layer act as inputs to the next layer Cheung/Cannons 13
  • 15. Fundamentals Neural Networks XOR Example Classes 0 Inputs Output Design 1 Inputs: 0, 1 Results H1: Net = 0(4.83) + 1(-4.83) – 2.82 = -7.65 H2: Net = 0(-4.63) + 1(4.6) – 2.74 = 1.86 Output = 1 / (1 + e7.65) = 4.758 x 10-4 Output = 1 / (1 + e-1.86) = 0.8652 O: Net = 4.758 x 10-4(5.73) + 0.8652(5.83) – 2.86 = 2.187 Output = 1 / (1 + e-2.187) = 0.8991 ≡ “1” Cheung/Cannons 14
  • 16. Fundamentals Neural Networks Backpropagation ● Most common method of obtaining the many Classes weights in the network ● A form of supervised training ● The basic backpropagation algorithm is based on Design minimizing the error of the network using the derivatives of the error function ► Simple ► Slow Results ► Prone to local minima issues Cheung/Cannons 15
  • 17. Fundamentals Neural Networks Backpropagation ● Most common measure of error is the mean Classes square error: E = (target – output)2 ● Partial derivatives of the error wrt the weights: Design ► Output Neurons: let: δj = f’(netj) (targetj – outputj) j = output neuron ∂E/∂wji = -outputi δj i = neuron in last hidden Results ► Hidden Neurons: j = hidden neuron let: δj = f’(netj) Σ(δkwkj) i = neuron in previous layer ∂E/∂wji = -outputi δj k = neuron in next layer Cheung/Cannons 16
  • 18. Fundamentals Neural Networks Backpropagation ● Calculation of the derivatives flows backwards Classes through the network, hence the name, backpropagation ● These derivatives point in the direction of the maximum increase of the error function Design ● A small step (learning rate) in the opposite direction will result in the maximum decrease of the (local) error function: Results wnew = wold – α ∂E/∂wold where α is the learning rate Cheung/Cannons 17
  • 19. Fundamentals Neural Networks Backpropagation ● The learning rate is important Classes ► Too small ■ Convergence extremely slow ► Too large ■ May not converge Design ● Momentum ► Tends to aid convergence ► Applies smoothed averaging to the change in weights: Results ∆new = β∆old - α ∂E/∂wold β is the momentum coefficient wnew = wold + ∆new ► Acts as a low-pass filter by reducing rapid fluctuations Cheung/Cannons 18
  • 20. Fundamentals Neural Networks Local Minima ● Training is essentially minimizing the mean square Classes error function ► Key problem is avoiding local minima ► Traditional techniques for avoiding local minima: ■ Simulated annealing Design Perturb the weights in progressively smaller amounts ■ Genetic algorithms Use the weights as chromosomes Apply natural selection, mating, and mutations to these chromosomes Results Cheung/Cannons 19
  • 21. Fundamentals Neural Networks Counterpropagation (CP) Networks ● Another multilayer feedforward network Classes ● Up to 100 times faster than backpropagation ● Not as general as backpropagation ● Made up of three layers: Design ► Input ► Kohonen ► Grossberg (Output) Results Inputs Input Kohonen Grossberg Outputs Layer Layer Layer Cheung/Cannons 20
  • 22. Fundamentals Neural Networks How Do They Work? ● Kohonen Layer: Classes ► Neurons in the Kohonen layer sum all of the weighted inputs received ► The neuron with the largest sum outputs a 1 and the other neurons output 0 Design ● Grossberg Layer: ► Each Grossberg neuron merely outputs the weight of the connection between itself and the one active Kohonen neuron Results Cheung/Cannons 21
  • 23. Fundamentals Neural Networks Why Two Different Types of Layers? ● More accurate representation of biological neural Classes networks ● Each layer has its own distinct purpose: ► Kohonen layer separates inputs into separate classes ■ Inputs in the same class will turn on the same Kohonen Design neuron ► Grossberg layer adjusts weights to obtain acceptable outputs for each class Results Cheung/Cannons 22
  • 24. Fundamentals Neural Networks Training a CP Network ● Training the Kohonen layer Classes ► Uses unsupervised training ► Input vectors are often normalized ► The one active Kohonen neuron updates its weights according to the formula: Design wnew = wold + α(input - wold) where α is the learning rate Results ■ The weights of the connections are being modified to more closely match the values of the inputs ■ At the end of training, the weights will approximate the average value of the inputs in that class Cheung/Cannons 23
  • 25. Fundamentals Neural Networks Training a CP Network ● Training the Grossberg layer Classes ► Uses supervised training ► Weight update algorithm is similar to that used in backpropagation Design Results Cheung/Cannons 24
  • 26. Fundamentals Neural Networks Hidden Layers and Neurons ● For most problems, one layer is sufficient Classes ● Two layers are required when the function is discontinuous ● The number of neurons is very important: Design ► Too few ■ Underfit the data – NN can’t learn the details ► Too many ■ Overfit the data – NN learns the insignificant details Results ► Start small and increase the number until satisfactory results are obtained Cheung/Cannons 25
  • 27. Fundamentals Neural Networks Overfitting Classes Design Training Test W ell fit Overfit Results Cheung/Cannons 26
  • 28. Fundamentals Neural Networks How is the Training Set Chosen? ● Overfitting can also occur if a “good” training set is Classes not chosen ● What constitutes a “good” training set? ► Samples must represent the general population Samples must contain members of each class Design ► ► Samples in each class must contain a wide range of variations or noise effect Results Cheung/Cannons 27
  • 29. Fundamentals Neural Networks Size of the Training Set ● The size of the training set is related to the Classes number of hidden neurons ► Eg. 10 inputs, 5 hidden neurons, 2 outputs: ► 11(5) + 6(2) = 67 weights (variables) ► If only 10 training samples are used to determine these Design weights, the network will end up being overfit ■ Any solution found will be specific to the 10 training samples ■ Analogous to having 10 equations, 67 unknowns you Results can come up with a specific solution, but you can’t find the general solution with the given information Cheung/Cannons 28
  • 30. Fundamentals Neural Networks Training and Verification ● The set of all known samples is broken into two Classes orthogonal (independent) sets: ► Training set ■ A group of samples used to train the neural network ► Testing set Design ■ A group of samples used to test the performance of the neural network ■ Used to estimate the error rate Results Known Samples Training Testing Set Set Cheung/Cannons 29
  • 31. Fundamentals Neural Networks Verification ● Provides an unbiased test of the quality of the Classes network ● Common error is to “test” the neural network using the same samples that were used to train the neural network Design ► The network was optimized on these samples, and will obviously perform well on them ► Doesn’t give any indication as to how well the network will be able to classify inputs that weren’t in the training Results set Cheung/Cannons 30
  • 32. Fundamentals Neural Networks Verification ● Various metrics can be used to grade the Classes performance of the neural network based upon the results of the testing set ► Mean square error, SNR, etc. ● Resampling is an alternative method of estimating Design error rate of the neural network ► Basic idea is to iterate the training and testing procedures multiple times Results ► Two main techniques are used: ■ Cross-Validation ■ Bootstrapping Cheung/Cannons 31
  • 33. Fundamentals Neural Networks Results and Discussion ● A simple toy problem was used to test the operation of a perceptron Classes ● Provided the perceptron with 5 pieces of information about a face – the individual’s hair, eye, nose, mouth, and ear type Design ► Each piece of information could take a value of +1 or -1 ■ +1 indicates a “girl” feature ■ -1 indicates a “guy” feature ● The individual was to be classified as a girl if the Results face had more “girl” features than “guy” features and a boy otherwise Cheung/Cannons 32
  • 34. Fundamentals Neural Networks Results and Discussion ● Constructed a perceptron with 5 inputs and 1 Classes output Design Face Input Output Output value Feature neurons neuron indicating Input boy or girl Values Results ● Trained the perceptron with 24 out of the 32 possible inputs over 1000 epochs ● The perceptron was able to classify the faces that were not in the training set Cheung/Cannons 33
  • 35. Results and Discussion ● A number of toy problems were tested on multilayer feedforward NNs with a single hidden layer and backpropagation: ► Inverter ■ The NN was trained to simply output 0.1 when given a “1” and 0.9 when given a “0” (a demonstration of the NN’s ability to memorize) ■ 1 input, 1 hidden neuron, 1 output ■ With a learning rate of 0.5 and no momentum, it took about 3,500 epochs for sufficient training ■ Including a momentum coefficient of 0.9 reduced the number of epochs required to about 250
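A minimal sketch of the inverter experiment is shown below: a 1-1-1 sigmoid network trained by backpropagation with the learning rate (0.5) and momentum coefficient (0.9) quoted on the slide. Everything else is an assumption made for illustration, including the squared-error loss, per-sample weight updates, the initial weight range, the seed, and the fixed 5,000 epochs, so the epoch counts reported on the slide should not be expected to reproduce exactly.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(params, x):
    """Return (hidden activation, output) of the 1-1-1 network."""
    w1, b1, w2, b2 = params
    hidden = sigmoid(w1 * x + b1)
    return hidden, sigmoid(w2 * hidden + b2)


def mean_error(params, samples):
    return sum((forward(params, x)[1] - t) ** 2 for x, t in samples) / len(samples)


def train(params, samples, epochs=5000, learning_rate=0.5, momentum=0.9):
    """Backpropagation where each weight change adds a fraction of the previous change."""
    params = list(params)
    deltas = [0.0] * 4
    for _ in range(epochs):
        for x, t in samples:
            w1, b1, w2, b2 = params
            hidden, output = forward(params, x)
            out_grad = (output - t) * output * (1.0 - output)
            hid_grad = out_grad * w2 * hidden * (1.0 - hidden)
            grads = [hid_grad * x, hid_grad, out_grad * hidden, out_grad]
            deltas = [momentum * d - learning_rate * g
                      for d, g in zip(deltas, grads)]
            params = [p + d for p, d in zip(params, deltas)]
    return params


samples = [(1.0, 0.1), (0.0, 0.9)]  # inverter: "1" -> 0.1, "0" -> 0.9
rng = random.Random(1)
initial = [rng.uniform(-0.5, 0.5) for _ in range(4)]  # assumed initial range
trained = train(initial, samples)
print(mean_error(initial, samples), mean_error(trained, samples))
print(forward(trained, 1.0)[1], forward(trained, 0.0)[1])
```

Setting `momentum=0.0` in the call to `train` gives the plain gradient-descent variant for comparison.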
  • 36. Results and Discussion ► Inverter (continued) ■ Increasing the learning rate decreased the training time without hampering convergence for this simple example ■ Increasing the epoch size (the number of samples per epoch) decreased the number of epochs required and seemed to aid convergence (reduced fluctuations) ■ Increasing the number of hidden neurons decreased the number of epochs required; this allowed the NN to better memorize the training set, which was the goal of this toy problem, but it is not recommended for “real” problems, since the NN loses its ability to generalize
  • 37. Results and Discussion ► AND gate ■ 2 inputs, 2 hidden neurons, 1 output ■ About 2,500 epochs were required when using momentum ► XOR gate ■ Same as AND gate ► 3-to-8 decoder ■ 3 inputs, 3 hidden neurons, 8 outputs ■ About 5,000 epochs were required when using momentum
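The training sets for these three problems are small enough to enumerate. The sketch below builds them in Python; encoding logical 0 and 1 as targets of 0.1 and 0.9 is an assumption carried over from the inverter example, and the helper names are hypothetical.

```python
import itertools

LOW, HIGH = 0.1, 0.9  # assumed target encoding for logical 0 and 1


def scale(bit):
    return HIGH if bit else LOW


def gate_samples(gate):
    """All four (inputs, target) pairs for a two-input logic gate."""
    return [((a, b), scale(gate(a, b)))
            for a, b in itertools.product((0, 1), repeat=2)]


def decoder_samples(n_inputs=3):
    """n-to-2^n decoder: only the output selected by the binary input is HIGH."""
    samples = []
    for bits in itertools.product((0, 1), repeat=n_inputs):
        index = int("".join(map(str, bits)), 2)
        targets = tuple(scale(i == index) for i in range(2 ** n_inputs))
        samples.append((bits, targets))
    return samples


and_samples = gate_samples(lambda a, b: a & b)
xor_samples = gate_samples(lambda a, b: a ^ b)
decoder = decoder_samples()
print(and_samples)
print(xor_samples)
print(decoder[5])
```

XOR is the only one of the three that is not linearly separable, which is why it needs the hidden layer rather than a single perceptron.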
  • 38. Results and Discussion ► Absolute sine function approximator (|sin(x)|) ■ A demonstration of the NN’s ability to learn the desired function, |sin(x)|, and to generalize ■ 1 input, 5 hidden neurons, 1 output ■ The NN was trained with samples between -π/2 and π/2 (the inputs were rounded to one decimal place; the desired targets were scaled to between 0.1 and 0.9) ■ The test data contained samples in between the training samples, i.e. with more than 1 decimal place (the outputs were translated back to between 0 and 1) ■ About 50,000 epochs were required with momentum ■ Not a smooth function at 0 (continuous, but only piecewise smooth)
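The data preparation described on this slide might look roughly like the sketch below. The exact grid of training inputs (-1.5 to 1.5 in steps of 0.1) and the use of midpoints as test inputs are assumptions; the slide only states that training inputs had one decimal place and test inputs fell in between.

```python
import math


def scale_target(y):
    """Map a function value in [0, 1] to the [0.1, 0.9] target range."""
    return 0.1 + 0.8 * y


def unscale_output(o):
    """Translate a network output in [0.1, 0.9] back to [0, 1]."""
    return (o - 0.1) / 0.8


# Training inputs with one decimal place inside [-pi/2, pi/2] (about +/-1.57).
train_inputs = [i / 10 for i in range(-15, 16)]
train_set = [(x, scale_target(abs(math.sin(x)))) for x in train_inputs]

# Test inputs halfway between neighbouring training inputs.
test_inputs = [(a + b) / 2 for a, b in zip(train_inputs, train_inputs[1:])]
test_set = [(x, abs(math.sin(x))) for x in test_inputs]

print(len(train_set), len(test_set))
```

Keeping the targets away from 0 and 1 avoids asking a sigmoid output neuron for values it can only reach asymptotically.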
  • 39. Results and Discussion ► Gaussian function approximator (e^(-x^2)) ■ 1 input, 2 hidden neurons, 1 output ■ Similar to the absolute sine function approximator, except that the domain was changed to between -3 and 3 ■ About 10,000 epochs were required with momentum ■ Smooth function
  • 40. Results and Discussion ► Primality tester ■ 7 inputs, 8 hidden neurons, 1 output ■ The input to the NN was a binary number ■ The NN was trained to output 0.9 if the number was prime and 0.1 if the number was composite (a classification and memorization test) ■ The inputs were restricted to between 0 and 100 ■ About 50,000 epochs were required for the NN to memorize the classifications for the training set; no attempts at generalization were made due to the complexity of the pattern of prime numbers ■ Some issues with local minima
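The training set for the primality tester can be generated as in the sketch below. The most-significant-bit-first ordering is an assumption, as is giving 0 and 1 (which are neither prime nor composite) the 0.1 target; the slide does not say how either was handled.

```python
import math


def is_prime(n):
    if n < 2:
        return False
    return all(n % d for d in range(2, math.isqrt(n) + 1))


def to_bits(n, width=7):
    """Binary encoding of n, most significant bit first (assumed ordering)."""
    return [(n >> i) & 1 for i in reversed(range(width))]


# One (7-bit input, target) pair for every number from 0 to 100.
samples = [(to_bits(n), 0.9 if is_prime(n) else 0.1) for n in range(101)]
print(to_bits(97), is_prime(97))
```

Seven bits are enough because 100 is less than 2^7 = 128.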
  • 41. Results and Discussion ► Prime number generator ■ Provide the network with a seed, and a prime number of the same order should be returned ■ 7 inputs, 4 hidden neurons, 7 outputs ■ Both the inputs and outputs were binary numbers ■ The network was trained as an autoassociative network: prime numbers from 0 to 100 were presented to the network, and the network was asked to echo them; the intent was to have the network output the closest prime number when given a composite number ■ After one million epochs, the network was able to produce prime numbers for about 85-90% of the numbers between 0 and 100 ■ Using Gray code instead of binary did not improve results ■ Perhaps it needs a second hidden layer, or some heuristics to reduce local minima issues
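For readers unfamiliar with the Gray code mentioned on this slide, the sketch below shows the standard reflected Gray code conversion, in which consecutive integers differ in exactly one bit. It also shows one plausible way to score a generator as the fraction of its outputs that are prime; the scoring helper `fraction_prime` is a hypothetical illustration, not the authors' evaluation code.

```python
import math


def to_gray(n):
    """Reflected binary Gray code of a non-negative integer."""
    return n ^ (n >> 1)


def from_gray(g):
    """Inverse of to_gray."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def is_prime(n):
    if n < 2:
        return False
    return all(n % d for d in range(2, math.isqrt(n) + 1))


def fraction_prime(outputs):
    """Fraction of the decoded network outputs that are prime (hypothetical metric)."""
    return sum(is_prime(o) for o in outputs) / len(outputs)


for n in range(8):
    print(n, format(n, "03b"), format(to_gray(n), "03b"))
print(fraction_prime([2, 3, 4, 5]))
```

The appeal of Gray code for this task is that numerically adjacent values are also adjacent in the input space, although the slide reports that this did not help here.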
  • 42. Conclusion ● The toy examples confirmed the basic operation of neural networks and also demonstrated their ability to learn the desired function and generalize when needed ● The ability of neural networks to learn and generalize, in addition to their wide range of applicability, makes them very powerful tools
  • 43. Questions and Comments
  • 44. Acknowledgements ● Natural Sciences and Engineering Research Council (NSERC) ● University of Manitoba