University of Dhaka
Tanvir Siddike Moin
 Artificial Neural Networks (ANN)
 Information processing paradigm inspired by biological nervous
systems
 ANN is composed of a system of neurons connected by
synapses
 ANNs learn by example
 Adjust synaptic connections between neurons
1943: McCulloch and Pitts model neural
networks based on their understanding of
neurology.
 Neurons embed simple logic functions:
 a or b
 a and b
1950s:
 Farley and Clark
 IBM group that tries to model biological behavior
 Consult neuro-scientists at McGill, whenever stuck
 Rochester, Holland, Haibt and Duda
Perceptron (Rosenblatt 1958)
 Three layer system:
 Input nodes
 Output node
 Association layer
 Can learn to connect or associate a given input
to a random output unit
Minsky and Papert
 Showed that a single layer perceptron cannot
learn the XOR of two binary inputs
 Led to loss of interest (and funding) in the
field
Perceptron (Rosenblatt 1958)
 Association units A1, A2, … extract features from
user input
 Output is weighted and associated
 Function fires if weighted sum of input exceeds a
threshold.
Back-propagation learning method
(Werbos 1974)
 Three layers of neurons
 Input, Output, Hidden
 Better learning rule for generic three layer
networks
 Regenerates interest in the 1980s
Successful applications in medicine,
marketing, risk management, … (1990)
In need of another breakthrough.
Applications of Artificial Neural Network in
Footwear Industry
• An ANN can be used to predict copolymer composition accurately as a
function of reaction conditions and conversions.
• Classification of leather fibers is one of the most typical problems;
it has been resolved successfully using ANNs.
• ANNs are used in leather and cotton grading with greater accuracy.
• ANNs have helped identify production control parameters and predict
the properties of melt-spun fibers in the case of synthetic fibers.
• ANNs have been used in conjunction with NIR spectroscopy for the
identification of leather and textile fibers.
Applications of Artificial Neural Network in
Footwear Industry
Application of ANN in leather and fabrics for shoe upper is summarized
below:
• ANNs are used to inspect leather and fabric for the detection of faults.
• Fabrics can be engineered by weaving, knitting or bonding. Neural
networks have been successfully applied in all three to optimise the input
parameters.
• The detection and recognition of patterns on leather and fabric belong to the
same complex category of problems and have likewise been resolved with ANNs.
• ANNs are used to predict leather and fabric drape.
• The assessment of leather and fabric handle has been improved by the
application of ANNs.
• ANNs in combination with fuzzy logic have been used to predict the
sensory properties of leather and fabrics.
• ANNs have been used to predict the tensile strength and the initial
stress-strain curve of leather and fabrics.
• Shear stiffness and compression properties of worsted leather and fabrics
have been successfully modelled.
Applications of Artificial Neural Network in
Footwear Industry
• Prediction of the spirality of relaxed knitted fabrics, as well as knit
global quality and subjective assessment of knit fabrics, has been
implemented using ANNs.
• Prediction of the thermal resistance and the thermal conductivity of
leather and fabrics has been realized with the help of ANNs.
• Moisture and heat transfer in knitted fabrics have been studied
successfully with ANN modelling.
• Engineering of leather and fabrics used in safety and protection
applications is supported by ANNs.
• Prediction of leather and fabric end use is also possible via ANN
methods.
• Optimization of the application of a repellent coating has also been
approached with an ANN model.
• Colour measurement, evaluation, comparison and prediction are major
tasks in the dyeing and finishing stage of leather and textile
processing, and they are carried out with ANNs.
Applications of Artificial Neural Network in
Footwear Industry
• Prediction of bursting strength using ANNs for woven and
knitted fabrics has been achieved with satisfactory results.
• Permeability of woven fabrics has been modelled using ANNs;
impact permeability has also been studied and the quality of the
neural models has been assessed.
• Pilling propensity of leather and fabrics has been predicted,
and the pilling of leather and fabrics has been evaluated.
• The presence of fuzz fibers has been modelled by ANNs.
• ANNs are used in evaluating wrinkled leather and fabrics
with image analysis.
Applications of Artificial Neural Network in
Footwear Industry
Application of ANNs in footwear:
A large variety of ANN approaches for decision-making problems in
the footwear industry have been proposed in the last two decades.
Three major leading areas are summarized below:
• Resolving prediction problems: RGI involves various predictions
during the production process, such as leather and fabric manufacturing
performance, sewing thread consumption, fashion sensory comfort,
cutting time, footwear sales, etc.
• Resolving classification problems: this involves classification at
various levels, for example leather and fabric online classification,
leather and fabric defect classification, leather and fabric end-use
classification and seam feature classification.
• Model identification problems.
Applications of Artificial Neural Network in
Footwear Industry
• Thread consumption is predicted via an ANN model.
• Seam puckering is evaluated and sewing thread is optimized
through ANN models.
• Prediction of sewing performance is also possible using
ANNs.
• Prediction of the performance of leather and fabrics in footwear
manufacturing and fit footwear design has been realized based on
ANN systems.
 Promises
 Combine speed of silicon with proven success of carbon →
artificial brains
 Natural neurons
Neuron collects signals from dendrites
Sends out spikes of electrical activity
through an axon, which splits into
thousands of branches.
At the end of each branch, a synapse converts
activity into either exciting or inhibiting
activity of a dendrite at another neuron.
Neuron fires when exciting activity
surpasses inhibitory activity
Learning changes the effectiveness of the
synapses
 Natural neurons
 Abstract neuron model:
 Bias Nodes
 Add one node to each layer that has constant output
 Forward propagation
 Calculate from input layer to output layer
 For each neuron:
 Calculate weighted average of input
 Calculate activation function
Firing Rules:
 Threshold rules:
 Calculate weighted average of input
 Fire if larger than threshold
 Perceptron rule
 Calculate weighted average of input
 Output activation level is:
f(x) = 1 if x ≥ 1/2
f(x) = 0 if x < 1/2
 Firing Rules: Sigmoid functions:
 Hyperbolic tangent function: tanh(x/2) = (1 - exp(-x)) / (1 + exp(-x))
 Logistic activation function: φ(x) = 1 / (1 + exp(-x))
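These activation functions can be written directly in Python; a minimal sketch (the function names are mine, not from the slides):

import math

def logistic(x):
    # Logistic activation: 1 / (1 + exp(-x))
    return 1.0 / (1.0 + math.exp(-x))

def tanh_half(x):
    # Hyperbolic tangent form used above: tanh(x/2) = (1 - exp(-x)) / (1 + exp(-x))
    return (1.0 - math.exp(-x)) / (1.0 + math.exp(-x))

def logistic_derivative(x):
    # phi'(x) = phi(x) * (1 - phi(x)), used later in back-propagation
    p = logistic(x)
    return p * (1.0 - p)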
Apply input vector X to layer of neurons.
Calculate V_j(n) = Σ_{i=1..p} W_ji × X_i - Threshold
 where X_i is the activation of previous layer neuron i
 W_ji is the weight going from node i to node j
 p is the number of neurons in the previous layer
Calculate output activation Y_j(n) = 1 / (1 + exp(-V_j(n)))
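A one-neuron version of this computation in Python, as a sketch (the helper name and arguments are illustrative):

import math

def neuron_output(weights, inputs, threshold):
    # V_j = sum_i W_ji * X_i - Threshold
    v = sum(w * x for w, x in zip(weights, inputs)) - threshold
    # Y_j = 1 / (1 + exp(-V_j))
    return 1.0 / (1.0 + math.exp(-v))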
Example: ADALINE Neural Network
 Calculates AND of inputs
[Network diagram: nodes 0 and 1 are the inputs, node 2 is the bias node, node 3 is the output.
Weights: w0,3 = 0.6, w1,3 = 0.6, w2,3 = -0.9.
The threshold function is a step function.]
Example: Three layer network
 Calculates XOR of inputs
[Network diagram: nodes 0 and 1 are the inputs, nodes 2 and 3 are hidden neurons, node 4 is the output, plus a bias node.
Weights: into node 2: -4.8 (from node 0), 4.6 (from node 1), bias -2.6;
into node 3: 5.1 (from node 0), -5.2 (from node 1), bias -3.2;
into node 4: 5.9 (from node 2), 5.2 (from node 3), bias -2.7.]
 Input (0,0)
 Node 2 activation is φ(-4.8 × 0 + 4.6 × 0 - 2.6) = φ(-2.6) = 0.0691
 Input (0,0)
 Node 3 activation is φ(5.1 × 0 - 5.2 × 0 - 3.2) = φ(-3.2) = 0.0392
 Input (0,0)
 Node 4 activation is φ(5.9 × 0.0691 + 5.2 × 0.0392 - 2.7) = 0.110227
Input (0,1)
 Node 2 activation is φ(-4.8 × 0 + 4.6 × 1 - 2.6) = φ(2.0) = 0.880797
Input (0,1)
 Node 3 activation is φ(5.1 × 0 - 5.2 × 1 - 3.2) = φ(-8.4) = 0.000224817
 Input (0,1)
 Node 4 activation is φ(5.9 × 0.880797 + 5.2 × 0.000224817 - 2.7) = 0.923992
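For reference, the forward passes above can be reproduced with a short script using the logistic activation and the weights read off the diagram (the function names are mine):

import math

def sigma(x):
    return 1.0 / (1.0 + math.exp(-x))

def xor_net(x0, x1):
    h2 = sigma(-4.8 * x0 + 4.6 * x1 - 2.6)   # node 2
    h3 = sigma( 5.1 * x0 - 5.2 * x1 - 3.2)   # node 3
    return sigma(5.9 * h2 + 5.2 * h3 - 2.7)  # node 4

for a in (0, 1):
    for b in (0, 1):
        print(a, b, round(xor_net(a, b), 3))
# (0,0) -> 0.110, (0,1) -> 0.924, (1,0) -> 0.861, (1,1) -> 0.102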
 Density Plot of Output
 Network can learn a non-linearly separated set of outputs.
 Need to map output (real value) into binary values.
 Weights are determined by training
 Back-propagation:
 On given input, compare actual output to desired output.
 Adjust weights to output nodes.
 Work backwards through the various layers
 Start out with initial random weights
 Best to keep weights close to zero (<<10)
 Weights are determined by training
 Need a training set
 Should be representative of the problem
 During each training epoch:
 Submit training set element as input
 Calculate the error for the output neurons
 Calculate average error during epoch
 Adjust weights
 Error is the mean square of differences in output layer
E(x) = 1/2 × Σ_{k=1..K} ( y_k(x) - t_k(x) )²
y – observed output
t – target output
 Error of training epoch is the average of all errors.
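A direct transcription of this error measure in Python (a sketch; the helper names are mine):

def example_error(outputs, targets):
    # E(x) = 1/2 * sum_k (y_k(x) - t_k(x))^2
    return 0.5 * sum((y - t) ** 2 for y, t in zip(outputs, targets))

def epoch_error(all_outputs, all_targets):
    # Error of a training epoch is the average of the per-example errors
    errors = [example_error(o, t) for o, t in zip(all_outputs, all_targets)]
    return sum(errors) / len(errors)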
 Update weights and thresholds using
 Weights: w_{j,k} ← w_{j,k} - η × ∂E(x)/∂w_{j,k}
 Bias: θ_k ← θ_k - η × ∂E(x)/∂θ_k
 η is a possibly time-dependent factor that should prevent
overcorrection
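The update step, sketched in Python (the default learning-rate value is an arbitrary placeholder):

def update_weight(w, dE_dw, eta=0.1):
    # w <- w - eta * dE/dw; eta is the (possibly time-dependent) learning-rate factor
    return w - eta * dE_dw

def update_bias(theta, dE_dtheta, eta=0.1):
    # Same rule for the threshold/bias term
    return theta - eta * dE_dtheta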
Using a sigmoid function, we get:
 The logistic function φ has derivative φ'(t) = φ(t) × (1 - φ(t))
 ∂E(x)/∂w_jk = -y_k × φ'(net_j) × (t_j - y_j)
 (here j is an output neuron and w_jk is the weight from neuron k into neuron j)
Start out with
random, small
weights
x1 x2 y
0 0 0.687349
0 1 0.667459
1 0 0.698070
1 1 0.676727
[Network diagram: nodes 0 and 1 are the inputs, nodes 2 and 3 are hidden neurons, node 4 is the output, plus a bias node; small initial weights -0.5, 1, 1, -0.5, 0.1, -0.5, 1, -0.5, -1.]
x1 x2 y Error
0 0 0.69 0.472448
0 1 0.67 0.110583
1 0 0.70 0.0911618
1 1 0.68 0.457959
Average Error is 0.283038
 Calculate the derivative of the error with respect to the
weights and bias into the output layer neurons
New weights going into node 4
We do this for all training inputs,
then average out the changes
net4 is the weighted sum of input
going into neuron 4:
net4(0,0)= 0.787754
net4(0,1)= 0.696717
net4(1,0)= 0.838124
net4(1,1)= 0.73877
New weights going into node 4
We calculate the derivative of the
activation function at the point given
by the net-input.
Recall our cool formula φ'(t) = φ(t)(1 - φ(t))
φ'(net4(0,0)) = φ'(0.787754) = 0.214900
φ'(net4(0,1)) = φ'(0.696717) = 0.221957
φ'(net4(1,0)) = φ'(0.838124) = 0.210768
φ'(net4(1,1)) = φ'(0.738770) = 0.218769
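These derivative values can be reproduced with the logistic derivative; a small sketch (phi/phid follow the naming used later in the slides):

import math

def phi(x):
    return 1.0 / (1.0 + math.exp(-x))

def phid(x):
    # phi'(x) = phi(x) * (1 - phi(x))
    return phi(x) * (1.0 - phi(x))

for net in (0.787754, 0.696717, 0.838124, 0.738770):
    print(round(phid(net), 6))   # ≈ 0.214900, 0.221957, 0.210768, 0.218769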
New weights going into node 4
We now obtain δ values for each input separately:
Input 0,0: δ4 = φ'(net4(0,0)) × (0 - y4(0,0)) = -0.152928
Input 0,1: δ4 = φ'(net4(0,1)) × (1 - y4(0,1)) = 0.0682324
Input 1,0: δ4 = φ'(net4(1,0)) × (1 - y4(1,0)) = 0.0593889
Input 1,1: δ4 = φ'(net4(1,1)) × (0 - y4(1,1)) = -0.153776
New weights going into node 4
Average: δ4 = -0.0447706
We can now update the weights going into node 4.
Let's call E_ji the derivative of the error
function with respect to the weight
going from neuron i into neuron j.
We do this for every possible input:
E4,2 = -output(neuron 2) × δ4
For (0,0): E4,2 = 0.0577366
For (0,1): E4,2 = -0.0424719
For (1,0): E4,2 = -0.0159721
For (1,1): E4,2 = 0.0768878
Average is 0.0190451
New weight from 2 to 4 is now going to
be 0.1190451.
New weights going into node 4
For (0,0): E4,3 = 0.0411287
For (0,1): E4,3 = -0.0341162
For (1,0): E4,3 = -0.0108341
For (1,1): E4,3 = 0.0580565
Average is 0.0135588
New weight is -0.486441
New weights going into node 4:
We also need to change the bias node.
For (0,0): E4,B = 0.0411287
For (0,1): E4,B = -0.0341162
For (1,0): E4,B = -0.0108341
For (1,1): E4,B = 0.0580565
Average is 0.0447706
New weight is 1.0447706
 We now have adjusted all the weights into the output layer.
 Next, we adjust the hidden layer
 The target output is given by the delta values of the output layer
 More formally:
 Assume that j is a hidden neuron
 Assume that δk is the delta-value for an output neuron k.
 While the example has only one output neuron, most ANNs have more.
When we sum over k, this means summing over all output neurons.
 wkj is the weight from neuron j into neuron k
δ_j = φ'(net_j) × Σ_k ( w_kj × δ_k )
∂E/∂w_ji = -δ_j × y_i
We now calculate the updates to the
weights of neuron 2.
First, we calculate the net-input into 2.
This is really simple because it is just a
linear function of the arguments x1 and x2:
net2 = -0.5 x1 + x2 - 0.5
We obtain
δ2(0,0) = -0.00359387
δ2(0,1) = 0.00160349
δ2(1,0) = 0.00116766
δ2(1,1) = -0.00384439
Call E20 the derivative of E with respect
to w20. We use the output activation for
the neurons in the previous layer
(which happens to be the input layer).
E20(0,0) = -(0) × δ2(0,0) = 0.00179694
E20(0,1) = 0.00179694
E20(1,0) = -0.000853626
E20(1,1) = 0.00281047
The average is 0.00073801 and the
new weight is -0.499262
Call E21 the derivative of E with respect
to w21. We use the output activation for
the neurons in the previous layer
(which happens to be the input layer).
E21(0,0) = -(1) × δ2(0,0) = 0.00179694
E21(0,1) = -0.00117224
E21(1,0) = -0.000583829
E21(1,1) = 0.00281047
The average is 0.000712835 and the
new weight is 1.00071
Call E2B the derivative of E with respect
to w2B. The bias output is always -0.5.
E2B(0,0) = -(-0.5) × δ2(0,0) = 0.00179694
E2B(0,1) = -0.00117224
E2B(1,0) = -0.000583829
E2B(1,1) = 0.00281047
The average is 0.00058339 and the
new weight is -0.499417
We now calculate the updates to the
weights of neuron 3.
…
 ANN Back-propagation is an empirical algorithm
 XOR is too simple an example, since the quality of an ANN is
measured on a finite set of inputs.
 More relevant are ANNs that are trained on a training set
and unleashed on real data
Need to measure effectiveness of training
 Need training sets
 Need test sets.
There can be no interaction between test
sets and training sets.
 Example of a Mistake:
 Train ANN on training set.
 Test ANN on test set.
 Results are poor.
 Go back to training ANN.
 After this, there is no assurance that ANN will work well in
practice.
 In a subtle way, the test set has become part of the training
set.
 Convergence
 ANN back propagation uses gradient descent.
 Naïve implementations can
 overcorrect weights
 undercorrect weights
 In either case, convergence can be poor
 Stuck in the wrong place
 ANN starts with random weights and improves them
 If improvement stops, we stop algorithm
 No guarantee that we found the best set of weights
 Could be stuck in a local minimum
 Overtraining
 An ANN can be made to work too well on a training set
 But lose performance on test sets
[Plot: performance versus training time for the training set and the test set]
 Overtraining
 Assume we want to separate the red from the green dots.
 Eventually, the network will learn to do well on the training
case
 But it will have learnt only the particularities of our training set
 Overtraining
 Improving Convergence
 Many Operations Research Tools apply
 Simulated annealing
 Sophisticated gradient descent
 ANN is a largely empirical study
 “Seems to work in almost all cases that we know about”
 Known to be statistical pattern analysis
Number of layers
 Apparently, three layers is almost always good
enough and better than four layers.
 Also: fewer layers are faster in execution and
training
How many hidden nodes?
 Many hidden nodes allow the network to learn more
complicated patterns
 Because of overtraining, it is almost always best to
start with too few hidden nodes and
then increase their number.
 Interpreting Output
 An ANN's output neurons do not give binary values.
 Good or bad
 Need to define what counts as an accept.
 Can indicate n degrees of certainty with n-1 output neurons.
 Number of firing output neurons is the degree of certainty
 Pattern recognition
 Network attacks
 Breast cancer
 …
 handwriting recognition
 Pattern completion
 Auto-association
 ANN trained to reproduce input as output
 Noise reduction
 Compression
 Finding anomalies
 Time Series Completion
 ANNs can do some things really well
 They lack in structure found in most natural neural
networks
 phi – activation function
 phid – derivative of activation function
Forward Propagation:
 Input nodes i, given input xi:
foreach input node i:
    output_i = x_i
 Hidden layer nodes j:
foreach hidden neuron j:
    output_j = phi( Σ_i w_ji × output_i )
 Output layer neurons k:
foreach output neuron k:
    output_k = phi( Σ_j w_kj × output_j )
ActivateLayer(input, output)
foreach input neuron i:
    calculate output_i
foreach hidden neuron j:
    calculate output_j
foreach output neuron k:
    calculate output_k
output = {output_k}
 Output Error
Error() {
    foreach input in InputSet:
        Error_input = Σ_{k ∈ output neurons} (target_k - output_k)²
    return Average(Error_input, InputSet)
}
 Gradient Calculation
 We calculate the gradient of the error with respect to a given
weight wkj.
 The gradient is the average of the gradients for all inputs.
 Calculation proceeds from the output layer to the hidden layer
For each output neuron k calculate:
δ_k = φ'(net_k) × (target_k - output_k)
For each output neuron k and hidden layer neuron j calculate:
∂E/∂W_kj = -δ_k × output_j
For each hidden neuron j calculate:
δ_j = φ'(net_j) × Σ_k ( W_kj × δ_k )
For each hidden neuron j and each input neuron i calculate:
∂E/∂W_ji = -δ_j × output_i
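These delta rules can be put into code for a single training example (a sketch that reuses forward() from the previous snippet; the gradients keep the leading minus sign shown above):

def gradients_single_example(x, target, w_hidden, w_output):
    # Forward pass, reusing forward() from the previous sketch
    hidden, output = forward(x, w_hidden, w_output)

    # Output layer: delta_k = phi'(net_k) * (target_k - output_k),
    # with phi'(net_k) written as output_k * (1 - output_k) for the logistic
    delta_out = [o * (1 - o) * (t - o) for o, t in zip(output, target)]

    # dE/dW_kj = -delta_k * output_j
    grad_out = [[-dk * hj for hj in hidden] for dk in delta_out]

    # Hidden layer: delta_j = phi'(net_j) * sum_k W_kj * delta_k
    delta_hid = [h * (1 - h) * sum(w_output[k][0][j] * delta_out[k]
                                   for k in range(len(delta_out)))
                 for j, h in enumerate(hidden)]

    # dE/dW_ji = -delta_j * x_i
    grad_hid = [[-dj * xi for xi in x] for dj in delta_hid]
    return grad_out, grad_hid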
 These calculations were done for a single input.
 Now calculate the average gradient over all inputs (and for
all weights).
 You also need to calculate the gradients for the bias
weights and average them.
Naïve back-propagation code:
 Initialize weights to a small random value
(between -1 and 1)
 For a maximum number of iterations do
 Calculate the average error for all inputs. If the error is smaller
than the tolerance, exit.
 For each input, calculate the gradients for all weights,
including bias weights and average them.
 If length of gradient vector is smaller than a small value,
then stop.
 Otherwise:
 Modify all weights by adding a negative multiple of the
gradient to the weights.
 This naïve algorithm has problems with convergence and
should only be used for toy problems.
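For completeness, a toy end-to-end version of this naive scheme trained on the XOR example (a sketch: the learning rate, iteration count and tolerance are arbitrary choices, the update is applied per input rather than averaged, and, as noted above, convergence is not guaranteed):

import math, random

def phi(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_xor(iterations=20000, eta=0.5, tolerance=0.01):
    # Two hidden neurons and one output neuron; the last entry of each
    # weight list is the bias weight. Start with small random weights.
    wh = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
    wo = [random.uniform(-1, 1) for _ in range(3)]
    data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

    for _ in range(iterations):
        total_error = 0.0
        for (x0, x1), t in data:
            # Forward pass
            h = [phi(w[0] * x0 + w[1] * x1 + w[2]) for w in wh]
            y = phi(wo[0] * h[0] + wo[1] * h[1] + wo[2])
            total_error += 0.5 * (y - t) ** 2

            # Deltas (logistic derivative written as y * (1 - y))
            delta_o = y * (1 - y) * (t - y)
            delta_h = [h[j] * (1 - h[j]) * wo[j] * delta_o for j in range(2)]

            # Weight updates (descent step; may still end in a local minimum)
            wo[0] += eta * delta_o * h[0]
            wo[1] += eta * delta_o * h[1]
            wo[2] += eta * delta_o
            for j in range(2):
                wh[j][0] += eta * delta_h[j] * x0
                wh[j][1] += eta * delta_h[j] * x1
                wh[j][2] += eta * delta_h[j]

        if total_error / len(data) < tolerance:
            break
    return wh, wo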