An Introduction to Neural Networks: The Perceptron
The human brain is essentially a large and unimaginably complex Neural Network. We can also think of the brain as an organized series of interconnected subsections of Neural Networks. We will look at how nature has implemented the Neural Network, and then look at the workings of the most common artificial Neural Network, the Perceptron.

Neurons
The Neural Network of a mature human brain contains about 100 billion nerve cells called neurons. These neurons are the fundamental part of the Neural Network. Neurons form complex networks of interconnections, called synapses, with each other. A typical neuron can interconnect with up to 10,000 other neurons, with the average neuron interconnecting with about 1,000 other neurons.

Synapse of Interconnecting Neurons
For more detailed information on the synaptic interconnections between neurons at the microscopic level, there are interesting animations to be found at:
Chemical Synapse
The Mind Project
Brain Basics - Firing of Neurons

The Biological Neural Network
Although the mechanism of the synapse itself is compelling, the focus of this article is the Neural Network itself, and particularly the Perceptron. The Perceptron is a simple and common configuration of an artificial Neural Network. We will start with a brief history of Artificial Intelligence and the Perceptron itself.

Artificial Intelligence - Mimicking the Human Brain
With the advent of electronic computers in the 1940s, people began to think about the possibility of artificial brains, or what is commonly known as Artificial Intelligence. In the beginning, some thought that the logic gate, the building block of digital computers, could serve as an artificial neuron, but this idea was quickly rejected. In 1949, Donald Hebb proposed an artificial neuron that more closely mimicked the biological neuron, where each neuron would have numerous interconnections with other neurons. Each of these interconnections would have a weight multiplier associated with it. Learning would be achieved by changing the weight multipliers of each of the interconnections. In 1957, Frank Rosenblatt implemented a Hebb neuron, which he called a Perceptron.

In 1974, Paul Werbos in his PhD thesis first described the process of training artificial neural networks through a process called the "Backpropagation of Errors". Just as Frank Rosenblatt developed the ideas of Donald Hebb, in 1986 David E. Rumelhart, Geoffrey E. Hinton, and Ronald J. Williams took the idea of Paul Werbos and developed a practical Backpropagation algorithm, which led to a renaissance in the field of artificial neural network research.

That renaissance has led to a new Perceptron, the multi-layered Perceptron. The multi-layered Perceptron of today is now synonymous with the term Perceptron, and has also become synonymous with the term Neural Network itself.

The Feed-Forward Multi-Layered Perceptron
What makes the modern Feed-Forward Multi-Layered Perceptron so powerful is that it essentially teaches itself by using the Backpropagation learning algorithm. We will look into how the Multi-Layered Perceptron works, and the process by which it teaches itself using Backpropagation.

Neural Networks Using the Multi-Layered Perceptron

NASA: A Prediction of Plant Growth in Space
Obviously, anything done in space must be done as efficiently as possible. To optimize plant growth, NASA created this Perceptron, taught with actual data, to simulate different growth environments.

Mayo Clinic: A Tumor Classifier
The above perceptron is self-explanatory. A perceptron need not be complex to be useful.

An Early Commercial Use: The Original Palm Pilot
Although it may seem awkward now, since most cell phones have a full keyboard, the early Palm Pilot used a stylus, or electronic pen, to enter characters freehand. It used a perceptron to learn to read a particular user's handwriting. An unforeseen popular use of the Palm Pilot was for anthropologists to enter script from ancient languages that they transcribed from ancient stone and clay tablets.
Papnet: Assisted Screening of Pap Smears

Papnet is a commercial neural network-based computer program for assisted screening of Pap (cervical) smears. A Pap smear test examines cells taken from the uterine cervix for signs of precancerous and cancerous changes. A properly taken and analysed Pap smear can detect very early precancerous changes. These precancerous cells can then be eliminated, usually in a relatively simple office or outpatient procedure.

Type These Characters: The anti-NeuralNet Application
You may have seen the kind of prompt shown above when logging on to some web site. Its purpose is to disguise a sequence of letters so a Neural Net cannot read it. With this readable-only-by-a-human safeguard, web sites are protected against other computers entering these sites via exhaustive attempts at passwords.

The Individual Nodes of the Multi-Layered Perceptron

Since the modern Perceptron is a Neural Network in itself, to understand it we need to go back to its basic building block, the artificial neuron. As was stated, the original perceptron served as the artificial neuron. We will call what serves today as the artificial neuron the Threshold Logic Unit, or TLU.
The original Hebb neuron would sum all of its inputs. Each input in turn was the product of an external input times that external input's corresponding weight multiplier. The Threshold Logic Unit, or TLU, adds a significant feature to the original Hebb neuron, the Activation Function. The Activation Function takes as input the sum of what is now called the Input Function, which is essentially the Hebb neuron, and scales it to a value between 0 and 1.

The selection of the mathematical function that implements the Activation Function is a pivotal design decision. Not only does it control the mapping of the Input Function's sum to a value between 0 and 1, its selection directly affects the development of the Perceptron's ability to teach itself, as we will see shortly. A common Activation Function that we will use is the sigmoid function.

With the sigmoid Activation Function, each TLU will now output a value between 0 and 1. This 0-1 valued output, coupled with a weight multiplier with a value between -1 and +1, will keep values within the network to a manageable level. With the Activation Function, the TLU more closely mimics the operation of the neuron.

Teaching the Perceptron Using Backpropagation

Let's start with a very simple perceptron with 2 inputs and 2 outputs, as shown below. As our perceptron receives Input 1 and Input 2, it responds with the values of its outputs, Output 1 and Output 2.

The process of teaching this perceptron consists of giving it a series of input pairs, and then comparing the actual output values that were generated with the desired output values that correspond to each pair of inputs. Based upon the difference between the actual output values and the desired output values, adjustments are made.

The only things that are changed during training are the weight multipliers. Consequently, the process of teaching the perceptron is a matter of changing the weight multipliers until the actual outputs are as close as possible to the desired outputs. As we stated earlier, the perceptron uses the process of Backpropagation to change its weight multipliers, and thus teach itself.

Mathematics of Learning via Minimizing Error

Locating Output 1 in our simple perceptron, we can see that it has 2 inputs. Those inputs to Output 1 are the outputs from each of the TLUs in the Hidden Layer. The Hidden Layer outputs are each multiplied by their corresponding weight multipliers, wo11 and wo21. The weights are identified by their source and destination TLUs. For example, the ID wo21 stands for the weight to an output node, from Hidden node 2 to Output node 1.

Looking again at the weight multipliers of the 2 inputs to Output 1, wo11 and wo21, we can make a 3 dimensional graph where the x-axis corresponds to the value of the wo11 weight multiplier and the y-axis corresponds to the value of the wo21 weight multiplier. The meaning of the height, or z-axis, will follow.
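To make the TLU concrete, here is a minimal Python sketch of how Output 1 could be computed from the two Hidden Layer outputs and the weights wo11 and wo21. The function names and the numeric values are hypothetical, chosen only for illustration; they do not come from the article.

```python
import math

def sigmoid(x: float) -> float:
    """Sigmoid Activation Function: maps any real number to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))

def tlu_output(inputs, weights):
    """Threshold Logic Unit: weighted sum (the Input Function), then the sigmoid."""
    net = sum(x * w for x, w in zip(inputs, weights))
    return sigmoid(net)

# Hypothetical values, for illustration only.
hidden_outputs = [0.6, 0.9]   # outputs of Hidden node 1 and Hidden node 2
wo11, wo21 = 0.4, -0.7        # weight multipliers into Output 1

output_1 = tlu_output(hidden_outputs, [wo11, wo21])
print(f"Output 1 = {output_1:.4f}")
```

Whatever the weights are, the sigmoid keeps the result strictly between 0 and 1, which is the scaling property described above.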
Although weight multipliers can be negative, we will consider only the possible positive values for these weights, which are between 0 and 1. For a given value of the 2 coordinates (wo11, wo21), there is an associated amount of difference between the desired value of Output 1 and the actual value. This difference we will call the delta, and it is the value of the height on the z-axis.

At the ideal values of wo11 and wo21, the delta is zero, so the height is zero. The farther any pair of wo11 and wo21 values is from the ideal values, the greater the height of the delta (the size of the error). The result is that this 3D graph forms a bowl or funnel shape, with the ideal values of wo11 and wo21 at the bottom.

So any time we find ourselves at some point (wo11, wo21) on the graph that is not the ideal point, we will want to slide downhill toward the bottom of our virtual bowl. Differential Calculus gives us this ability with the Gradient. The mathematical function for the Gradient is:

Yikes! Fortunately, we don't have to worry about the particulars. We just need to know that mathematically, once we have identified a non-ideal pair of values for wo11 and wo21 in our virtual bowl, we have a means of determining the direction to go to get closer to the ideal values.

Now we can get a feel for the theoretical "Backpropagation of Errors" process that Paul Werbos described in 1974. We will now go into the steps of the actual process that was finally implemented by a team in 1986. Obviously, it was not a trivial task.

In short, the team of 1986 constructed a generalization of the complicated mathematics for a generic perceptron, and then simplified that mathematical process down into simple parts. This mathematical simplification was essentially doing on a very large scale what we do on a small scale when we simplify a fraction to lowest terms.

To imagine the level of complexity of the original model, consider that in our simple perceptron, Output 1 has only 2 inputs. These 2 inputs form a 3 dimensional graph. To model up to n inputs, mathematicians had to imagine a virtual bowl in n+1 dimensional hyperspace. Then they had to describe an n+1 dimensional gradient.

After all this complexity, the first major simplification was to eliminate the n+1 dimensional hyperspace. Differential Calculus is still involved, but only in 2 dimensions with 1 independent variable. So the adjustment to each incoming weight multiplier could be considered independently.

Taking another look at our perceptron:
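The idea of sliding downhill in the virtual bowl can be sketched in a few lines of Python. For the two weights into Output 1, the gradient is the pair of partial derivatives of the error with respect to wo11 and wo21. This sketch makes several assumptions not stated in the article: the error is taken as the squared difference between desired and actual output, the gradient is estimated numerically rather than derived, and all numeric values (hidden outputs, desired output, step size) are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def error(wo11, wo21, h1, h2, desired):
    """Assumed error measure: squared difference between desired and actual Output 1."""
    actual = sigmoid(wo11 * h1 + wo21 * h2)
    return (desired - actual) ** 2

def numerical_gradient(wo11, wo21, h1, h2, desired, eps=1e-6):
    """Central-difference estimate of the two partial derivatives of the error."""
    g11 = (error(wo11 + eps, wo21, h1, h2, desired)
           - error(wo11 - eps, wo21, h1, h2, desired)) / (2 * eps)
    g21 = (error(wo11, wo21 + eps, h1, h2, desired)
           - error(wo11, wo21 - eps, h1, h2, desired)) / (2 * eps)
    return g11, g21

# Hypothetical values, for illustration only.
h1, h2, desired = 0.6, 0.9, 0.7
wo11, wo21 = 0.1, 0.1
step = 0.5

start_error = error(wo11, wo21, h1, h2, desired)
for _ in range(200):
    g11, g21 = numerical_gradient(wo11, wo21, h1, h2, desired)
    # Move against the gradient, i.e. downhill toward the bottom of the bowl.
    wo11 -= step * g11
    wo21 -= step * g21
end_error = error(wo11, wo21, h1, h2, desired)

print(f"error before: {start_error:.6f}, after: {end_error:.6f}")
```

Each pass moves the pair (wo11, wo21) a small distance in the direction that lowers the error, which is all the gradient is needed for here.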
We can now mathematically adjust our weights to Output 1 with the following equations:
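A standard form of these weight adjustments, consistent with the learning rate and sigmoid derivative discussed in this article, can be written as follows. This is a sketch under stated assumptions rather than the article's exact notation: d_1 is the desired value of Output 1, y_1 is its actual value, h_1 and h_2 are the outputs of Hidden nodes 1 and 2, f is the Activation Function, net_1 = wo_11 h_1 + wo_21 h_2 is the Input Function sum, and the error being minimized is assumed to be one half of the squared difference (d_1 - y_1)^2.

```latex
\begin{aligned}
wo_{11}^{\text{new}} &= wo_{11} + \eta \,(d_1 - y_1)\, f'(net_1)\, h_1 \\
wo_{21}^{\text{new}} &= wo_{21} + \eta \,(d_1 - y_1)\, f'(net_1)\, h_2
\end{aligned}
```

Each weight is adjusted on its own, using only the output error, the slope of the Activation Function, and the value that arrived over that particular connection.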
The Learning rate constant η is a fractional multiplier used to limit the size of any particular adjustment. With computers, the perceptron can go through the same learning sequence over and over again, so each adjustment can be small. If adjustments are too large, they might overcompensate and the weights would just oscillate back and forth.

Whew! Things are much better now than they were with partial derivatives in hyperspace, but there is still the matter of the derivative of the sigmoid function, which is our TLU's Activation function. As stated earlier, using the sigmoid function was a pivotal design decision. By looking at its derivative, we can see why:
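For reference, the sigmoid function and its derivative can be written out as below. The symbol x for the Input Function sum is an assumption made for this sketch; y is the value of the Activation function.

```latex
y = f(x) = \frac{1}{1 + e^{-x}}
\qquad\Longrightarrow\qquad
\frac{dy}{dx} = \frac{e^{-x}}{\left(1 + e^{-x}\right)^{2}} = y\,(1 - y)
```

The last step follows because 1 - y = e^{-x} / (1 + e^{-x}), so the product y(1 - y) reproduces the middle expression exactly.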
Consequently, the derivative of the sigmoid function, or Activation function, becomes a simple algebraic expression of the value of the function itself. Since the value of the Activation function, which we call y, has to be computed anyway to determine the output value of any TLU, the derivative term becomes trivial.

With this simple derivative term, once we compute y, the adjustment to wo21 becomes the simple algebraic expression:
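In its usual form, that adjustment is η (d - y) y (1 - y) h, where d is the desired output, y the actual output, and h the Hidden Layer output arriving over the weight being adjusted. The Python sketch below applies it repeatedly to the two weights into a single output TLU. It is an illustration under assumptions, not the article's own implementation: the function name, the fixed Hidden Layer outputs, the desired value, and the learning rate are all hypothetical, and only the output weights are trained.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def update_output_weights(weights, hidden_outputs, desired, eta):
    """One adjustment step for the weights feeding a single output TLU."""
    net = sum(w * h for w, h in zip(weights, hidden_outputs))
    y = sigmoid(net)                              # needed anyway for the output
    error_term = (desired - y) * y * (1.0 - y)    # error times sigmoid derivative
    new_weights = [w + eta * error_term * h
                   for w, h in zip(weights, hidden_outputs)]
    return new_weights, y

# Hypothetical values, for illustration only.
hidden_outputs = [0.6, 0.9]   # outputs of Hidden node 1 and Hidden node 2
weights = [0.1, 0.1]          # wo11 and wo21
desired = 0.7
eta = 0.5                     # Learning rate constant

history = []
for _ in range(500):
    weights, y = update_output_weights(weights, hidden_outputs, desired, eta)
    history.append(abs(desired - y))

print(f"first error: {history[0]:.6f}, last error: {history[-1]:.6f}")
print(f"wo11 = {weights[0]:.4f}, wo21 = {weights[1]:.4f}")
```

Note that the derivative costs nothing extra: y is reused directly, so no exponential has to be evaluated a second time.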