Neural Networks on Steroids
1. Autocorrect and Neural Networks
Adam Blevins
Dr. Kasper Peeters
Basic Idea: a Neural Network is a form of Artificial Intelligence that, like autocorrect, can teach itself.
Diagram: Input = "miktex" → Network Calculations → Output = "Molten";
Target = "MikTeX" → Network Learning Occurs → Trained Output = "MikTeX"
You type "miktex" into your phone and it autocorrects to "molten". You then delete "molten" and retype "MikTeX" as required, and the phone learns "MikTeX" as the correction for "miktex" in the future.
What is a Neural Network?
An Artificial Neural Network (ANN) is effectively a computer
program that learns to interpret large amounts of data. We
present the network with a training set, a learning algorithm
makes corrections to the calculations within, and this repeats
until the network is suitably trained. One of the simplest
examples of an ANN takes the following form:
Figure 1: A small ANN architecture example. (Layers labelled Inputs, Hidden Units and Output Unit: the input x and a constant b feed two hidden σ units, which together with another b feed the output f(x); each arrow carries a weight w.)
The circles are called nodes and the text within them defines their outputs. x is the input, with f(x) the corresponding network output; b is a constant and σ is a sigmoid function, for example tanh(y), where y is the input to the node. w represents the weight of a connection between nodes, and each arrow represents a weighted connection. A network may have more units in each layer and many more layers, allowing for more complex calculations.
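As a concrete illustration, here is a minimal Python sketch of a forward pass through the network of Figure 1, assuming tanh as the sigmoid σ, a linear output unit, and invented weight and bias values:

    import math

    def forward(x, w_xh, b_h, w_hy, b_y):
        # Hidden layer: each unit applies sigma to its weighted input plus bias
        hidden = [math.tanh(w * x + b) for w, b in zip(w_xh, b_h)]
        # Output unit (taken to be linear here): weighted sum of hidden outputs plus bias
        return sum(w * h for w, h in zip(w_hy, hidden)) + b_y

    # Invented weights and biases for the two hidden units of Figure 1
    print(forward(x=0.5, w_xh=[0.8, -1.2], b_h=[0.1, 0.1], w_hy=[1.5, 0.7], b_y=0.05))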
Uses of Neural Networks
In addition to their use with autocorrect and ability to discover
patterns in large data sets, Neural Networks are also commonly
used for:
1. Image recognition, particularly facial recognition, for
example identifying a criminal using a database of
mugshots.
2. Character recognition, which is increasingly popular, especially in devices like the Galaxy Note, where writing with a stylus is commonplace.
3. Function interpolation.
The Network Calculations
Each node has a set of connections to the nodes in the previous layer. Defining the output of node i as y_i, the input to a node j is the sum of those outputs, each multiplied by the weight of its connection:

\[ \mathrm{input}_j = \sum_i y_i w_{ji} \tag{1} \]
For the sigmoid nodes in Figure 1, an example output y_i of node i is tanh(input_i). The next important ingredient is the learning algorithm. The most common choice is the Backpropagation algorithm [1], which changes the weights according to the following equation:
\[ \Delta w_{ji} = -\eta \, \frac{\partial E}{\partial w_{ji}} \tag{2} \]
where η is a constant (the learning rate) that controls the magnitude of the change and E is the error as a function of the weights. The error function E has many local minima; for an accurate network we want to converge on the global minimum, where the error is smallest.
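To make Equation (2) concrete, the sketch below repeatedly updates the single weight of a tiny one-weight network, estimating ∂E/∂w by a finite difference rather than by the analytic, layer-by-layer derivatives that Backpropagation actually computes; the error function, training pair and η are invented for the illustration.

    import math

    def error(w, x=0.5, target=0.8):
        # Squared error of a one-weight network y = tanh(w * x) on one training pair
        y = math.tanh(w * x)
        return 0.5 * (y - target) ** 2

    def update(w, eta=0.5, h=1e-6):
        # Finite-difference estimate of dE/dw, then Equation (2)
        dE_dw = (error(w + h) - error(w - h)) / (2 * h)
        return w - eta * dE_dw

    w = 0.0
    for _ in range(500):
        w = update(w)
    print(w, error(w))  # w approaches atanh(0.8)/0.5 and the error shrinks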
The Error Function
The error depends upon the weights of the network. If we
consider a very simple network in which the error only depends
on one weight, we can imagine it looks something like this:
Figure 2: An error function dependent on one weight
To maximise the accuracy of the trained network, the Backpropagation algorithm needs to converge on the global minimum. Whether it does depends on a number of factors, not least the learning rate η from Equation (2). If the weights start in the basin of a local minimum, then for η too small the weight changes may be too small to escape it. If η is too big we could jump over the global minimum entirely and cause divergence. We therefore want a method of initialising the weights that gives us the greatest chance of reaching the global minimum.
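The following sketch illustrates both failure modes on an invented one-weight error function with a local minimum near w = 1.13 and the global minimum near w = -1.30: a small η settles in whichever minimum is nearest to the starting weight, while a large η overshoots and diverges.

    def dE_dw(w):
        # Derivative of the invented error E(w) = w**4 - 3*w**2 + w,
        # which has a local minimum near w = 1.13 and a global minimum near w = -1.30
        return 4 * w**3 - 6 * w + 1

    def descend(w, eta, steps=200):
        for _ in range(steps):
            w -= eta * dE_dw(w)      # Equation (2)
            if abs(w) > 1e6:         # runaway weight: the updates have diverged
                return float("inf")
        return w

    print(descend(2.0, eta=0.01))  # small eta: settles in the local minimum near 1.13
    print(descend(2.0, eta=0.20))  # large eta: overshoots the minima and diverges (inf)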
Pre-training a Neural Network
Pre-training is a method of finding initial weights for the Neural Network before normal training begins. A common technique uses autoencoders. An autoencoder takes two consecutive layers, beginning with the input layer and the first hidden layer, and uses the Backpropagation algorithm to train this subnetwork. The autoencoder mirrors the leftmost layer (shown in magenta in Figure 3) so that the subnetwork functions as shown:
Figure 3: An example autoencoder with two input nodes and
one hidden node
The original training data X is used, and the smaller number of nodes in the hidden layer forces a dimension reduction, producing a simplified representation of X, say a set Y. Y holds the key characteristics of the set X. The next two layers are trained in the same way using the set Y, and so on until the entire network is pre-trained. The network is then rebuilt with the pre-trained weights. This gives a starting set of weights closer to the global minimum, which provides both a greater chance of convergence and a faster rate of convergence.
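A minimal sketch of this greedy layer-wise procedure, assuming tanh units, no bias terms, plain batch gradient descent, and invented data and layer sizes:

    import numpy as np

    rng = np.random.default_rng(0)

    def train_autoencoder(X, n_hidden, eta=0.1, epochs=500):
        # Train a one-hidden-layer autoencoder X -> hidden -> X; return the
        # encoder weights and the encoded (dimension-reduced) data.
        n_in = X.shape[1]
        W_enc = rng.normal(0, 0.1, (n_in, n_hidden))
        W_dec = rng.normal(0, 0.1, (n_hidden, n_in))
        for _ in range(epochs):
            H = np.tanh(X @ W_enc)        # encode: Equation (1) at each hidden node
            X_hat = np.tanh(H @ W_dec)    # decode: reconstruct the input
            err = X_hat - X               # reconstruction error
            # Backpropagation: dE/dW for each layer, then Equation (2)
            d_out = err * (1 - X_hat**2)  # tanh'(z) = 1 - tanh(z)^2
            d_hid = (d_out @ W_dec.T) * (1 - H**2)
            W_dec -= eta * H.T @ d_out / len(X)
            W_enc -= eta * X.T @ d_hid / len(X)
        return W_enc, np.tanh(X @ W_enc)

    # Greedy layer-wise pre-training: each pair of layers is trained on the
    # encoding produced by the previous pair, as described above.
    X = rng.normal(size=(100, 8))         # invented training data
    layer_sizes = [8, 4, 2]               # invented network shape
    pretrained, data = [], X
    for n_hidden in layer_sizes[1:]:
        W, data = train_autoencoder(data, n_hidden)
        pretrained.append(W)              # initial weights for normal training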
Recommended Further Reading
The work of Geoffrey Hinton, a leading researcher in Neural Networks, who alongside David Rumelhart and Ronald Williams made significant contributions to the understanding of the Backpropagation algorithm in 1985 [2], is recommended.
References
[1] Michael Nielsen, Neural Networks and Deep Learning, 2014, http://neuralnetworksanddeeplearning.com/chap2.html
[2] D. E. Rumelhart, G. E. Hinton and R. J. Williams, 1985, Learning Internal Representations by Error Propagation.