The University of Texas Health Science Center at Houston
School of Health Information Sciences
Artificial Neural Networks and 
Pattern Recognition 
For students of HI 5323 
“Image Processing” 
Willy Wriggers, Ph.D. 
School of Health Information Sciences 
http://biomachina.org/courses/processing/13.html
Biology
What are Neural Networks?
• Models of the brain and nervous system
• Highly parallel
- Process information much more like the brain than a serial computer
• Learning
• Very simple principles
• Very complex behaviours
• Applications
- As powerful problem solvers
- As biological models
© torsten.reil@zoo.ox.ac.uk, users.ox.ac.uk/~quee0818/teaching/Neural_Networks.ppt
Neurophysiological Background
• 10 billion neurons in the human cortex
• 60 trillion synapses
• In the first two years after birth, ~1 million synapses formed per second
pyramidal cell
Organizing Principle
Various Types of Neurons
Neuron Models
Modeling the Neuron
[Diagram: inputs x_1 … x_n with weights w_1 … w_n and a bias input 1 with weight w_0 feed a combination function h(w_0, w_i, x_i); the output is y = f(h), where f is the activation function and h combines the w_i and x_i.]
© Leonard Studer, humanresources.web.cern.ch/humanresources/external/training/ tech/special/DISP2003/DISP-2003_L21A_30Apr03.ppt
Artificial Neuron Anatomy
Common Activation Functions
• Sigmoidal Function:
$y = f\left(h = w_0 \cdot 1 + \sum_{i=1}^{n} w_i x_i \;;\; \rho\right) = \frac{1}{1 + e^{-h/\rho}}$
• Radial Function, e.g. Gaussian:
$y = f\left(h = \sum_{i=1}^{n} (x_i - w_i)^2 \;;\; \sigma = w_0\right) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-h^2/2\sigma^2}$
• Linear Function:
$y = w_0 \cdot 1 + \sum_{i=1}^{n} w_i x_i$
© Leonard Studer, humanresources.web.cern.ch/humanresources/external/training/ tech/special/DISP2003/DISP-2003_L21A_30Apr03.ppt
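A minimal numerical sketch of the three activation functions above (not part of the original slides; assumes NumPy, and the helper names such as sigmoid_unit are illustrative):

import numpy as np

def sigmoid_unit(x, w, w0, rho=1.0):
    """Sigmoidal neuron: h = w0 + sum_i w_i x_i, y = 1 / (1 + exp(-h/rho))."""
    h = w0 + np.dot(w, x)
    return 1.0 / (1.0 + np.exp(-h / rho))

def radial_unit(x, w, w0):
    """Gaussian (radial) neuron: h = sum_i (x_i - w_i)^2, with sigma = w0."""
    h = np.sum((x - w) ** 2)
    sigma = w0
    return np.exp(-h / (2.0 * sigma ** 2)) / (np.sqrt(2.0 * np.pi) * sigma)

def linear_unit(x, w, w0):
    """Linear neuron: y = w0 + sum_i w_i x_i."""
    return w0 + np.dot(w, x)

x = np.array([1.0, 0.5])
w = np.array([0.25, -1.5])
print(sigmoid_unit(x, w, w0=0.0))   # ~0.3775, matching the squashing example later
print(radial_unit(x, w, w0=1.0))
print(linear_unit(x, w, w0=0.0))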
Supervised Learning
Artificial Neural Networks 
• ANNs incorporate the two fundamental components of 
biological neural nets: 
1. Neurones (nodes) 
2. Synapses (weights) 
Input Output 
© torsten.reil@zoo.ox.ac.uk, users.ox.ac.uk/~quee0818/teaching/Neural_Networks.ppt
“Pigeon” ANNs
• Pigeons as art experts (Watanabe et al. 1995)
• Experiment:
- Pigeon in a Skinner box
- Present paintings by two different artists (e.g. Chagall / Van Gogh)
- Reward for pecking when presented with a particular artist (e.g. Van Gogh)
© torsten.reil@zoo.ox.ac.uk, users.ox.ac.uk/~quee0818/teaching/Neural_Networks.ppt
Training Set: 
© torsten.reil@zoo.ox.ac.uk, users.ox.ac.uk/~quee0818/teaching/Neural_Networks.ppt 
(etc…)
Predictive Power:
• Pigeons were able to discriminate between Van Gogh and Chagall with
95% accuracy (when presented with pictures they had been trained on)
• Discrimination was still 85% successful for previously unseen paintings by
the artists.
• Pigeons do not simply memorise the pictures
• They can extract and recognise patterns (the ‘style’)
• They generalise from what they have already seen to make predictions
• This is what neural networks (biological and artificial) are good at
(unlike a conventional computer)
© torsten.reil@zoo.ox.ac.uk, users.ox.ac.uk/~quee0818/teaching/Neural_Networks.ppt
Real ANN Applications
• Recognition of hand-written letters
• Predicting on-line the quality of welding spots
• Identifying relevant documents in a corpus
• Visualizing high-dimensional space
• Tracking on-line the position of robot arms
• …etc
© Leonard Studer, humanresources.web.cern.ch/humanresources/external/training/ tech/special/DISP2003/DISP-2003_L21A_30Apr03.ppt
ANN Design
1. Get a large amount of data: inputs and outputs
2. Analyze the data on the PC
- Relevant inputs?
- Linear correlations (is an ANN necessary)?
- Transform and scale variables
- Other useful preprocessing?
- Divide into 3 data sets:
training set
test set
validation set
© Leonard Studer, humanresources.web.cern.ch/humanresources/external/training/ tech/special/DISP2003/DISP-2003_L21A_30Apr03.ppt
ANN Design
3. Set the ANN architecture: What type of ANN?
- Number of inputs, outputs?
- Number of hidden layers
- Number of neurons
- Details of the learning scheme
4. Tune/optimize internal parameters by presenting the training data set to the ANN
5. Validate on the test / validation dataset
© Leonard Studer, humanresources.web.cern.ch/humanresources/external/training/ tech/special/DISP2003/DISP-2003_L21A_30Apr03.ppt
Main Types of ANN
Supervised Learning:
• Feed-forward ANN
- Multi-Layer Perceptron (with sigmoid hidden neurons)
• Recurrent Networks
- Neurons are connected to themselves and others
- Time delay of signal transfer
- Multidirectional information flow
Unsupervised Learning:
• Self-organizing ANN
- Kohonen Maps
- Vector Quantization
- Neural Gas
Feed-Forward ANN 
• Information flow is unidirectional 
• Data is presented to Input layer 
• Passed on to Hidden Layer 
• Passed on to Output layer 
• Information is distributed 
• Information processing is parallel 
Internal representation (interpretation) of data 
© torsten.reil@zoo.ox.ac.uk, users.ox.ac.uk/~quee0818/teaching/Neural_Networks.ppt
Supervised Learning
Training set: $\{(x^{\mu}_{\mathrm{in}}, t^{\mu}_{\mathrm{out}});\ 1 \le \mu \le P\}$
The network maps each input $x^{\mu}_{\mathrm{in}}$ to an output $x^{\mu}_{\mathrm{out}}$; the desired output (supervisor) is $t^{\mu}_{\mathrm{out}}$.
Error: $\mathrm{error} = x^{\mu}_{\mathrm{out}} - t^{\mu}_{\mathrm{out}}$
Typically minimized by backpropagation of errors.
© Leonard Studer, humanresources.web.cern.ch/humanresources/external/training/ tech/special/DISP2003/DISP-2003_L21A_30Apr03.ppt
Important Properties of FFN
• Assume
- g(x): bounded and sufficiently regular function
- FFN with 1 hidden layer of a finite number N of neurons
(the transfer function is identical for every neuron)
• => the FFN is a universal approximator of g(x)
Theorem by Cybenko et al. in 1989
In the sense of uniform approximation
For arbitrary precision ε
© Leonard Studer, humanresources.web.cern.ch/humanresources/external/training/ tech/special/DISP2003/DISP-2003_L21A_30Apr03.ppt
Important Properties of FFN
• Assume
- FFN as before
(1 hidden layer of finitely many neurons, non-linear transfer function)
- Approximation precision ε
• => #{wi} ~ # inputs
Theorem by Barron in 1993
An ANN is more parsimonious in #{wi} than a linear approximator
[linear approximator: #{wi} ~ exp(# inputs)]
© Leonard Studer, humanresources.web.cern.ch/humanresources/external/training/ tech/special/DISP2003/DISP-2003_L21A_30Apr03.ppt
Roughness of Output
• The output depends on the whole set of
weighted links {wij}
• Example: output unit versus input 1 and
input 2 for a 2×10×1 ANN with random
weights
© Leonard Studer, humanresources.web.cern.ch/humanresources/external/training/ tech/special/DISP2003/DISP-2003_L21A_30Apr03.ppt
Feeding Data Through the FFN
(1 × 0.25) + (0.5 × (-1.5)) = 0.25 + (-0.75) = -0.5
Squashing: $\frac{1}{1 + e^{0.5}} = 0.3775$
© torsten.reil@zoo.ox.ac.uk, users.ox.ac.uk/~quee0818/teaching/Neural_Networks.ppt
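A short sketch of a full forward pass through a feed-forward net, consistent with the single-unit squashing example above (not from the slides; the 2-3-1 layer sizes and random weights are illustrative only, NumPy assumed):

import numpy as np

def sigmoid(h):
    return 1.0 / (1.0 + np.exp(-h))

def forward(x, weights, biases):
    """Propagate an input vector through successive layers (hidden -> output)."""
    a = x
    for W, b in zip(weights, biases):
        a = sigmoid(W @ a + b)
    return a

# toy 2-3-1 network with arbitrary weights (illustrative only)
rng = np.random.default_rng(0)
weights = [rng.normal(size=(3, 2)), rng.normal(size=(1, 3))]
biases = [np.zeros(3), np.zeros(1)]
print(forward(np.array([1.0, 0.5]), weights, biases))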
Feeding Data Through the FFN
• Data is presented to the network in the form of activations in the input layer
• Examples
- Pixel intensity (for pictures)
- Molecule concentrations (for an artificial nose)
- Share prices (for stock market prediction)
• Data usually requires preprocessing
- Analogous to senses in biology
• How to represent more abstract data, e.g. a name?
- Choose a pattern, e.g.
0-0-1 for “Chris”
0-1-0 for “Becky”
© torsten.reil@zoo.ox.ac.uk, users.ox.ac.uk/~quee0818/teaching/Neural_Networks.ppt
Training the Network
How do we adjust the weights?
• Backpropagation
- Requires training set (input / output pairs)
- Starts with small random weights
- Error is used to adjust weights (supervised learning)
→ Gradient descent on error landscape
© torsten.reil@zoo.ox.ac.uk, users.ox.ac.uk/~quee0818/teaching/Neural_Networks.ppt
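A minimal backpropagation sketch for a one-hidden-layer sigmoid network trained by gradient descent on the squared error (not from the slides; the network size, learning rate and toy data are assumptions for illustration, NumPy assumed):

import numpy as np

def sigmoid(h):
    return 1.0 / (1.0 + np.exp(-h))

rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(200, 2))           # inputs
T = (X[:, :1] * X[:, 1:] > 0).astype(float)     # toy target (XOR-like sign pattern)

W1 = rng.normal(scale=0.5, size=(4, 2)); b1 = np.zeros(4)   # small random weights
W2 = rng.normal(scale=0.5, size=(1, 4)); b2 = np.zeros(1)
eta = 0.5                                        # learning rate

for epoch in range(2000):
    # forward pass
    H = sigmoid(X @ W1.T + b1)                   # hidden activations
    Y = sigmoid(H @ W2.T + b2)                   # network output
    # backward pass: gradients of the mean squared error
    dY = (Y - T) * Y * (1 - Y)                   # output-layer delta
    dH = (dY @ W2) * H * (1 - H)                 # hidden-layer delta
    W2 -= eta * dY.T @ H / len(X); b2 -= eta * dY.mean(axis=0)
    W1 -= eta * dH.T @ X / len(X); b1 -= eta * dH.mean(axis=0)

print("final error:", np.mean((Y - T) ** 2))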
Backpropagation 
© torsten.reil@zoo.ox.ac.uk, users.ox.ac.uk/~quee0818/teaching/Neural_Networks.ppt
Backpropagation
• Advantages
- It works!
- Relatively fast
• Downsides
- Requires a training set
- Can be slow to converge
- Probably not biologically realistic
• Alternatives to Backpropagation
- Hebbian learning
Not successful in feed-forward nets
- Reinforcement learning
Only limited success in FFN
- Artificial evolution
More general, but can be even slower than backprop
© torsten.reil@zoo.ox.ac.uk, users.ox.ac.uk/~quee0818/teaching/Neural_Networks.ppt
Applications of FFN
• Pattern recognition
- Character recognition
- Face recognition
• Sonar mine/rock recognition (Gorman & Sejnowski, 1988)
• Navigation of a car (Pomerleau, 1989)
• Stock-market prediction
• Pronunciation (NETtalk)
(Sejnowski & Rosenberg, 1987)
© torsten.reil@zoo.ox.ac.uk, users.ox.ac.uk/~quee0818/teaching/Neural_Networks.ppt
Protein Secondary Structure Prediction
(Holley-Karplus, Ph.D., etc.):
Supervised learning:
- Adjust weight vectors so the
output of the network matches the
desired result
[Diagram: an amino acid sequence is fed to the network, which predicts α-helical vs. coil.]
Recurrent Networks
• Feed-forward networks:
- Information only flows one way
- One input pattern produces one output
- No sense of time (or memory of previous state)
• Recurrency
- Nodes connect back to other nodes or themselves
- Information flow is multidirectional
- Sense of time and memory of previous state(s)
• Biological nervous systems show high levels of recurrency (but feed-forward
structures exist too)
© torsten.reil@zoo.ox.ac.uk, users.ox.ac.uk/~quee0818/teaching/Neural_Networks.ppt
Elman Nets 
• Elman nets are feed forward networks with partial 
recurrency 
• Unlike feed forward nets, Elman nets have a memory or 
sense of time 
© torsten.reil@zoo.ox.ac.uk, users.ox.ac.uk/~quee0818/teaching/Neural_Networks.ppt
Elman Nets
Classic experiment on language acquisition and processing (Elman, 1990)
• Task
- Train an Elman net to predict successive words in sentences.
• Data
- A suite of sentences, e.g.
“The boy catches the ball.”
“The girl eats an apple.”
- Words are input one at a time
• Representation
- Binary representation for each word, e.g.
0-1-0-0-0 for “girl”
• Training method
- Backpropagation
© torsten.reil@zoo.ox.ac.uk, users.ox.ac.uk/~quee0818/teaching/Neural_Networks.ppt
Elman Nets 
Internal 
representation 
of words 
© torsten.reil@zoo.ox.ac.uk, users.ox.ac.uk/~quee0818/teaching/Neural_Networks.ppt
Hopfield Networks
• Sub-type of recurrent neural nets
- Fully recurrent
- Weights are symmetric
- Nodes can only be on or off
- Random updating
• Learning: Hebb rule (cells that fire together
wire together)
• Can recall a memory, if presented with a
corrupt or incomplete version
→ auto-associative or
content-addressable memory
© torsten.reil@zoo.ox.ac.uk, users.ox.ac.uk/~quee0818/teaching/Neural_Networks.ppt
Hopfield Networks
Task: store images with a resolution of 20×20 pixels
→ Hopfield net with 400 nodes
Memorise:
1. Present image
2. Apply Hebb rule (cells that fire together, wire together)
- Increase the weight between two nodes if both have the same
activity, otherwise decrease it
3. Go to 1
Recall:
1. Present incomplete pattern
2. Pick a random node, update it
3. Go to 2 until settled
© torsten.reil@zoo.ox.ac.uk, users.ox.ac.uk/~quee0818/teaching/Neural_Networks.ppt
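A small sketch of Hebbian storage and random asynchronous recall in a Hopfield net (not from the slides; ±1 node states and the 8-node toy patterns are assumptions, NumPy assumed):

import numpy as np

def train_hopfield(patterns):
    """Hebb rule: increase w_ij when two nodes share activity, decrease otherwise."""
    n = patterns.shape[1]
    W = np.zeros((n, n))
    for p in patterns:                 # p is a +/-1 vector
        W += np.outer(p, p)
    np.fill_diagonal(W, 0)             # no self-connections
    return W / len(patterns)

def recall(W, state, steps=1000, rng=None):
    """Random asynchronous updates until (hopefully) settled."""
    rng = rng or np.random.default_rng(0)
    state = state.copy()
    for _ in range(steps):
        i = rng.integers(len(state))   # pick a random node
        state[i] = 1 if W[i] @ state >= 0 else -1
    return state

patterns = np.array([[1, -1, 1, -1, 1, -1, 1, -1],
                     [1, 1, 1, 1, -1, -1, -1, -1]])
W = train_hopfield(patterns)
corrupt = patterns[0].copy(); corrupt[:2] *= -1   # flip two bits
print(recall(W, corrupt))                         # should settle back to patterns[0]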
Hopfield Networks 
© torsten.reil@zoo.ox.ac.uk, users.ox.ac.uk/~quee0818/teaching/Neural_Networks.ppt
Hopfield Networks 
• Memories are attractors in state space 
© torsten.reil@zoo.ox.ac.uk, users.ox.ac.uk/~quee0818/teaching/Neural_Networks.ppt
Catastrophic Forgetting
• Problem: memorising new patterns corrupts the memory of older ones
→ Old memories cannot be recalled, or spurious memories arise
• Solution: allow the Hopfield net to sleep
© torsten.reil@zoo.ox.ac.uk, users.ox.ac.uk/~quee0818/teaching/Neural_Networks.ppt
Solutions
• Unlearning (Hopfield, 1986)
- Recall old memories by random stimulation, but use an inverse
Hebb rule
→ ‘Makes room’ for new memories (basins of attraction shrink)
• Pseudorehearsal (Robins, 1995)
- While learning new memories, recall old memories by random
stimulation
- Use the standard Hebb rule on new and old memories
→ Restructures memory
• Needs short-term + long-term memory
- Mammals: the hippocampus plays back new memories to the neocortex,
which is randomly stimulated at the same time
© torsten.reil@zoo.ox.ac.uk, users.ox.ac.uk/~quee0818/teaching/Neural_Networks.ppt
Unsupervised Learning
Unsupervised (Self-Organized) Learning
[Diagrams of three architectures:
- feed-forward (supervised): input layer → output layer
- feed-forward + lateral feedback (recurrent network, still supervised)
- self-organizing network (unsupervised): a continuous input space is mapped onto a discrete output space]
Self-Organizing Map (SOM)
[Diagram: a neural lattice mapped onto the input signal space (Kohonen, 1984)]
Illustration of Kohonen Learning
Inputs: coordinates (x, y) of points
drawn from a square
Display neuron j at the position (xj, yj) where
its output sj is maximum
[Snapshots: random initial positions, then after 100, 200 and 1000 inputs the lattice gradually unfolds over the square]
© Leonard Studer, humanresources.web.cern.ch/humanresources/external/training/ tech/special/DISP2003/DISP-2003_L21A_30Apr03.ppt
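A Kohonen SOM sketch reproducing the square-unfolding illustration above (not from the slides; grid size, learning-rate and neighborhood schedules are assumptions, NumPy assumed):

import numpy as np

def train_som(n_inputs=1000, grid=(10, 10), seed=0):
    """Kohonen learning on points drawn from the unit square.
    Neuron weights start at random positions and unfold into an ordered lattice."""
    rng = np.random.default_rng(seed)
    rows, cols = grid
    W = rng.uniform(0, 1, size=(rows, cols, 2))        # weight = position in input space
    coords = np.stack(np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1)
    for t in range(n_inputs):
        x = rng.uniform(0, 1, size=2)                  # stimulus from the square
        frac = t / n_inputs
        eta = 0.5 * (0.01 / 0.5) ** frac               # decaying learning rate
        sigma = 3.0 * (0.5 / 3.0) ** frac              # shrinking neighborhood
        # winner: neuron whose weight vector is closest to the stimulus
        dist = np.linalg.norm(W - x, axis=-1)
        winner = np.unravel_index(np.argmin(dist), dist.shape)
        # neighborhood function on the lattice, centered at the winner
        d_lattice = np.linalg.norm(coords - np.array(winner), axis=-1)
        h = np.exp(-d_lattice ** 2 / (2 * sigma ** 2))
        W += eta * h[..., None] * (x - W)              # pull neighborhood toward stimulus
    return W

W = train_som()
print(W.reshape(-1, 2)[:5])    # a few learned neuron positions in the unit square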
Why use Kohonen Maps?
• Image Analysis
- Image Classification
• Data Visualization
- By projection from high-D -> 2D,
preserving neighborhood relationships
• Partitioning Input Space
- Vector Quantization (Coding)
© Leonard Studer, humanresources.web.cern.ch/humanresources/external/training/ tech/special/DISP2003/DISP-2003_L21A_30Apr03.ppt
Example: 
Modeling of the somatosensory map of the hand (Ritter, Martinetz & 
Schulten, 1992).
Representing Topology
with the Kohonen SOM
• free the neurons from the lattice…
• stimulus-dependent connectivities
The “Neural Gas” Algorithm
(Martinetz & Schulten, 1992)
connectivity matrix: $C_{ij} \in \{0, 1\}$
age matrix: $T_{ij} \in \{0, \dots, T\}$
[Diagram: codebook vectors and their connections adapting to a stimulus]
Example
Example (cont.)
More Examples: Torus and Myosin S1
Growing Neural Gas
GNG = Neural Gas with
dynamic creation/removal of links
© http://www.neuroinformatik.ruhr-uni-bochum.de
Why use GNG?
• Adaptability to Data Topology
- Both dynamically and spatially
• Data Analysis
• Data Visualization
© Leonard Studer, humanresources.web.cern.ch/humanresources/external/training/ tech/special/DISP2003/DISP-2003_L21A_30Apr03.ppt
Radial Basis Function Networks
• Outputs are formed as a linear combination of a hidden layer of RBF neurons
• Inputs fan in to the hidden layer
• Usually an unsupervised learning procedure is applied:
set the number of neurons and then adjust:
1. Gaussian centers
2. Gaussian widths
3. weights
© Leonard Studer, humanresources.web.cern.ch/humanresources/external/training/ tech/special/DISP2003/DISP-2003_L21A_30Apr03.ppt
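A rough RBF-network sketch following the three-step recipe above (not from the slides; the crude k-means center placement, the shared width heuristic, and the least-squares output fit are illustrative choices, NumPy assumed):

import numpy as np

def fit_rbf(X, y, n_centers=10, seed=0):
    """RBF network sketch: k-means-like centers (unsupervised), a shared Gaussian
    width, and output weights fitted by linear least squares."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), n_centers, replace=False)]
    for _ in range(20):                                    # crude k-means for the centers
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        for i in range(n_centers):
            if np.any(labels == i):
                centers[i] = X[labels == i].mean(axis=0)
    sigma = np.mean(np.linalg.norm(centers[:, None] - centers, axis=-1)) + 1e-8
    Phi = np.exp(-((X[:, None] - centers) ** 2).sum(-1) / (2 * sigma ** 2))
    Phi = np.hstack([Phi, np.ones((len(X), 1))])           # bias column
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return centers, sigma, w

def predict_rbf(X, centers, sigma, w):
    Phi = np.exp(-((X[:, None] - centers) ** 2).sum(-1) / (2 * sigma ** 2))
    return np.hstack([Phi, np.ones((len(X), 1))]) @ w

X = np.linspace(-3, 3, 100)[:, None]
y = np.sin(X[:, 0])
params = fit_rbf(X, y)
print(np.mean((predict_rbf(X, *params) - y) ** 2))         # small regression error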
Why use RBF?
• Density estimation
• Discrimination
• Regression
• Good to know:
- Can be described as Bayesian Networks
- Close to some Fuzzy Systems
© Leonard Studer, humanresources.web.cern.ch/humanresources/external/training/ tech/special/DISP2003/DISP-2003_L21A_30Apr03.ppt
Demo 
Internet Java demo http://www.neuroinformatik.ruhr-uni-bochum.de/ini/VDM/research/gsn/DemoGNG/GNG.html 
• HebbRule 
• LBG / k-means 
• Neural Gas 
• GNG 
• Kohonen SOM
Revisiting Quantization
Vector Quantization
Lloyd (1957) and Linde, Buzo & Gray (1980): digital signal processing,
speech and image compression.
Martinetz & Schulten (1993): Neural Gas.
Encode data (in $\Re^D$) using a finite set $\{w_j\}$ (j = 1, …, k) of codebook vectors.
The codebook vectors divide $\Re^D$ into k Voronoi polyhedra (“receptive fields”):
$V_i = \{\, v \in \Re^D \;:\; \| v - w_i \| \le \| v - w_j \| \;\; \forall j \,\}$
Vector Quantization
k-Means a.k.a. Linde, Buzo & Gray (LBG)
Encoding distortion error:
$E = \sum_i \| v_i - w_{j(i)} \|^2$,
where the sum runs over the data points $v_i$ and $w_{j(i)}$ is the codebook vector closest to $v_i$.
Lower $E(\{w_j(t)\})$ iteratively by gradient descent:
$\forall r:\quad \Delta w_r \equiv w_r(t) - w_r(t-1) = -\frac{\varepsilon}{2} \cdot \frac{\partial E}{\partial w_r} = \varepsilon \cdot \sum_i \delta_{r,j(i)}\,(v_i - w_r)$.
On-line (Monte Carlo) approach for a sequence $v_i(t)$ selected at random
according to the probability density function:
$\Delta w_r(t) = \varepsilon \cdot \delta_{r,j(i)} \cdot (v_i(t) - w_r)$.
Advantage: fast, reasonable clustering.
Limitations: depends on the initial random positions;
difficult to avoid getting trapped in the many local minima of E
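A sketch of both variants above, batch LBG/k-means and the on-line Monte Carlo update (not from the slides; data, k, and step sizes are illustrative, NumPy assumed):

import numpy as np

def lbg_kmeans(V, k, iters=50, seed=0):
    """Batch LBG / k-means: assign each data point to its nearest codebook vector,
    then move each codebook vector to the mean of its receptive field."""
    rng = np.random.default_rng(seed)
    W = V[rng.choice(len(V), k, replace=False)].copy()
    for _ in range(iters):
        j = np.argmin(((V[:, None] - W) ** 2).sum(-1), axis=1)   # nearest codebook index
        for r in range(k):
            if np.any(j == r):
                W[r] = V[j == r].mean(axis=0)
    E = ((V - W[j]) ** 2).sum()                                  # distortion error
    return W, E

def online_kmeans_step(W, v, eps=0.05):
    """On-line (Monte Carlo) update for one random sample v: move only the winner."""
    r = np.argmin(((W - v) ** 2).sum(-1))
    W[r] += eps * (v - W[r])
    return W

V = np.random.default_rng(1).normal(size=(500, 2))
W, E = lbg_kmeans(V, k=8)
W = online_kmeans_step(W, V[0])
print("distortion:", E)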
Neural Gas Revisited
Avoid the local minima traps of k-means by smoothing the energy function:
$\forall r:\quad \Delta w_r(t) \;\sim\; \varepsilon \cdot e^{-s_r(v_i(t),\{w_j\})/\lambda} \cdot (v_i(t) - w_r)$,
where $s_r$ is the closeness rank:
$\| v_i - w_{j_0} \| \le \| v_i - w_{j_1} \| \le \dots \le \| v_i - w_{j_{k-1}} \|$,
with $s_{j_0} = 0,\; s_{j_1} = 1,\; \dots,\; s_{j_{k-1}} = k-1$.
Neural Gas Revisited
The update corresponds to stochastic gradient descent on the smoothed energy
$\tilde{E}(\{w_j\}, \lambda) = \sum_i \sum_{r=0}^{k-1} e^{-s_r/\lambda}\, \| v_i - w_{j_r(i)} \|^2$.
λ → 0: only the “winner” is updated; $\tilde{E} \to E$ (k-means).
λ ≠ 0: not only the “winner” $w_{j_0(i)}$, but also the second-, third-, … closest vectors are updated.
λ → ∞: $\tilde{E}$ is parabolic (a single minimum).
⇒ anneal λ(t) from large to small during learning.
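A neural-gas sketch with the rank-based update and annealed λ and step size, per the two slides above (not from the slides; the annealing schedules and data are illustrative, NumPy assumed):

import numpy as np

def neural_gas(V, k=8, steps=5000, seed=0):
    """Neural gas sketch: every codebook vector is updated according to its
    closeness rank s_r to the stimulus, weighted by exp(-s_r / lambda);
    lambda and the step size are annealed from large to small."""
    rng = np.random.default_rng(seed)
    W = V[rng.choice(len(V), k, replace=False)].copy()
    lam_i, lam_f = k / 2.0, 0.01
    eps_i, eps_f = 0.5, 0.005
    for t in range(steps):
        frac = t / steps
        lam = lam_i * (lam_f / lam_i) ** frac          # annealed neighborhood range
        eps = eps_i * (eps_f / eps_i) ** frac          # annealed step size
        v = V[rng.integers(len(V))]                    # random stimulus
        ranks = np.argsort(np.argsort(((W - v) ** 2).sum(-1)))   # s_r = 0 for the winner
        W += eps * np.exp(-ranks / lam)[:, None] * (v - W)
    return W

V = np.random.default_rng(2).normal(size=(1000, 2))
print(neural_gas(V)[:3])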
Neural Gas Revisited
Q: How do we know that we have found the global minimum of E?
A: We don’t (in general).
But we can compute the statistical variability of the $\{w_j\}$ by repeating the
calculation with different seeds for the random number generator.
Codebook vector variability arises due to:
• statistical uncertainty,
• the spread of local minima.
A small variability indicates good convergence behavior.
Optimum choice of the number of vectors k: where the variability is minimal.
Pattern Recognition
Pattern Recognition
Definition: “The assignment of a physical object or event to one
of several prespecified categories” -- Duda & Hart
• A pattern is an object, process or event that can be given a name.
• A pattern class (or category) is a set of patterns sharing common attributes and
usually originating from the same source.
• During recognition (or classification), given objects are assigned to prescribed
classes.
• A classifier is a machine which performs classification.
© Voitech Franc, cmp.felk.cvut.cz/~xfrancv/talks/franc-printro03.ppt
PR Applications
• Optical Character Recognition (OCR)
- Handwritten: sorting letters by postal code,
input device for PDAs.
- Printed texts: reading machines for blind
people, digitization of text documents.
• Biometrics
- Face recognition, verification, retrieval.
- Fingerprint recognition.
- Speech recognition.
• Diagnostic systems
- Medical diagnosis: X-ray, EKG analysis.
- Machine diagnostics, waster detection.
© Voitech Franc, cmp.felk.cvut.cz/~xfrancv/talks/franc-printro03.ppt
Approaches
• Statistical PR: based on an underlying statistical model of patterns and pattern
classes.
• Structural (or syntactic) PR: pattern classes are represented by means of formal
structures such as grammars, automata, strings, etc.
• Neural networks: the classifier is represented as a network of cells modeling the
neurons of the human brain (connectionist approach).
© Voitech Franc, cmp.felk.cvut.cz/~xfrancv/talks/franc-printro03.ppt
Basic Concepts
Feature vector $x = [x_1, x_2, \dots, x_n]^T$, $x \in X$
- A vector of observations (measurements).
- $x$ is a point in the feature space $X$.
Hidden state $y \in Y$
- Cannot be directly measured.
- Patterns with equal hidden state belong to the same class.
Task
- To design a classifier (decision rule)
$q : X \to Y$
which decides about the hidden state based on an observation.
Pattern: an observation $x$ together with its hidden state $y$.
© Voitech Franc, cmp.felk.cvut.cz/~xfrancv/talks/franc-printro03.ppt
Example
$x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} \text{height} \\ \text{weight} \end{bmatrix}$
Task: jockey-hoopster recognition.
The set of hidden states is $Y = \{H, J\}$
The feature space is $X = \Re^2$
Training examples: $\{(x_1, y_1), \dots, (x_\ell, y_\ell)\}$
Linear classifier:
$q(x) = \begin{cases} H & \text{if } (w \cdot x) + b \ge 0 \\ J & \text{if } (w \cdot x) + b < 0 \end{cases}$
[Diagram: in the $(x_1, x_2)$ plane, the line $(w \cdot x) + b = 0$ separates the region $y = H$ from the region $y = J$.]
© Voitech Franc, cmp.felk.cvut.cz/~xfrancv/talks/franc-printro03.ppt
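A tiny sketch of the linear classifier above for the jockey/hoopster example (not from the slides; the particular weights, bias, and feature units are made up for illustration, NumPy assumed):

import numpy as np

def q(x, w, b):
    """Linear classifier: return 'H' if (w . x) + b >= 0, else 'J'."""
    return "H" if np.dot(w, x) + b >= 0 else "J"

# illustrative weights: tall and heavy -> hoopster (H), small and light -> jockey (J)
w = np.array([1.0, 1.0])          # [height (m), weight (100 kg)]
b = -2.5
print(q(np.array([2.05, 1.0]), w, b))   # 'H'
print(q(np.array([1.55, 0.5]), w, b))   # 'J'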
Components of a PR System
[Diagram: pattern → sensors and preprocessing → feature extraction → classifier → class assignment;
a teacher and a learning algorithm adjust the classifier.]
• Sensors and preprocessing.
• Feature extraction aims to create discriminative features good for classification.
• A classifier.
• A teacher provides information about the hidden state -- supervised learning.
• A learning algorithm sets up the classifier from training examples.
© Voitech Franc, cmp.felk.cvut.cz/~xfrancv/talks/franc-printro03.ppt
Feature Extraction 
Task: to extract features which are good for classification. 
Good features: • Objects from the same class have similar feature values. 
• Objects from different classes have different values. 
“Good” features “Bad” features 
© Voitech Franc, cmp.felk.cvut.cz/~xfrancv/talks/franc-printro03.ppt
Feature Extraction Methods
[Diagrams:
- Feature extraction: the full measurement vector $[x_1, x_2, \dots, x_n]^T$ is mapped through functions $\phi_1, \phi_2, \dots$ to a feature vector $[m_1, m_2, \dots, m_k]^T$.
- Feature selection: a subset of the measurements is kept as the feature vector.]
The problem can be expressed as optimization of the parameters θ of the feature extractor $\phi(\theta)$.
Supervised methods: the objective function is a criterion of separability
(discriminability) of labeled examples, e.g., linear discriminant analysis (LDA).
Unsupervised methods: a lower-dimensional representation which preserves important
characteristics of the input data is sought, e.g., principal component analysis (PCA).
© Voitech Franc, cmp.felk.cvut.cz/~xfrancv/talks/franc-printro03.ppt
Classifier
A classifier partitions the feature space X into class-labeled regions such that
$X = X_1 \cup X_2 \cup \dots \cup X_{|Y|}$ and $X_1 \cap X_2 \cap \dots \cap X_{|Y|} = \emptyset$.
[Diagram: feature space divided into regions $X_1$, $X_2$, $X_3$; the regions of one class need not be contiguous.]
Classification consists of determining to which region a feature vector x belongs.
Borders between decision regions are called decision boundaries.
© Voitech Franc, cmp.felk.cvut.cz/~xfrancv/talks/franc-printro03.ppt
Representation of a Classifier
A classifier is typically represented as a set of discriminant functions
$f_i(x) : X \to \Re,\quad i = 1, \dots, |Y|$.
The classifier assigns a feature vector x to the i-th class if
$f_i(x) > f_j(x) \quad \forall j \ne i$.
[Diagram: the feature vector x is fed to the discriminant functions $f_1(x), \dots, f_{|Y|}(x)$; a max selector outputs the class identifier y.]
© Voitech Franc, cmp.felk.cvut.cz/~xfrancv/talks/franc-printro03.ppt
Bayesian Decision Making
• Bayesian decision making is a fundamental statistical approach which
allows one to design the optimal classifier if the complete statistical model is known.
Definition:
- Observations $X$
- Hidden states $Y$
- Decisions $D$
- A loss function $W : Y \times D \to \mathbb{R}$
- A decision rule $q : X \to D$
- A joint probability $p(x, y)$
Task: to design a decision rule q which minimizes the Bayesian risk
$R(q) = \sum_{y \in Y} \sum_{x \in X} p(x, y)\, W(q(x), y)$
© Voitech Franc, cmp.felk.cvut.cz/~xfrancv/talks/franc-printro03.ppt
Example of a Bayesian Task
Task: minimization of classification error.
The set of decisions D is the same as the set of hidden states Y.
The 0/1-loss function is used:
$W(q(x), y) = \begin{cases} 0 & \text{if } q(x) = y \\ 1 & \text{if } q(x) \ne y \end{cases}$
The Bayesian risk R(q) then corresponds to the probability of
misclassification.
The solution of the Bayesian task is
$q^* = \arg\min_q R(q) \;\Rightarrow\; q^*(x) = \arg\max_y p(y \mid x) = \arg\max_y \frac{p(x \mid y)\, p(y)}{p(x)}$
© Voitech Franc, cmp.felk.cvut.cz/~xfrancv/talks/franc-printro03.ppt
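A toy sketch of this Bayesian task under 0/1 loss on a small discrete joint model (not from the slides; the joint probability table and state names are invented for illustration, NumPy assumed):

import numpy as np

# toy joint probability p(x, y) over two observations and two hidden states
# rows: x in {0, 1}; columns: y in {'A', 'B'}  (numbers are illustrative)
states = ["A", "B"]
p_xy = np.array([[0.35, 0.10],
                 [0.15, 0.40]])

def bayes_rule(x):
    """0/1 loss: decide the state with maximal posterior p(y | x) ~ p(x, y)."""
    return states[np.argmax(p_xy[x])]

risk = sum(p_xy[x, y] for x in range(2) for y in range(2)
           if states[y] != bayes_rule(x))        # probability of misclassification
print([bayes_rule(x) for x in range(2)], "Bayes risk:", risk)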
Limitations of the Bayesian Approach
• The statistical model p(x,y) is usually not known, therefore
learning must be employed to estimate p(x,y) from training
examples {(x1,y1),…,(xA,yA)} -- plug-in Bayes.
• Non-Bayesian methods offer further task formulations:
• Only a partial statistical model is available:
• p(y) is not known or does not exist.
• p(x|y,θ) is influenced by a non-random intervention θ.
• The loss function is not defined.
• Examples: Neyman-Pearson's task, Minimax task, etc.
© Voitech Franc, cmp.felk.cvut.cz/~xfrancv/talks/franc-printro03.ppt
Discriminative Approaches
Given a class of classification rules q(x;θ) parametrized by θ∈Ξ,
the task is to find the “best” parameter θ* based on a set of
training examples {(x1,y1),…,(xA,yA)} -- supervised learning.
The task of learning: to recognize which classification rule is
to be used.
How the learning is performed is determined by the
selected inductive principle.
© Voitech Franc, cmp.felk.cvut.cz/~xfrancv/talks/franc-printro03.ppt
Empirical Risk Minimization Principle
The true expected risk R(q) is approximated by the empirical risk
$R_{\mathrm{emp}}(q(x; \theta)) = \frac{1}{A} \sum_{i=1}^{A} W(q(x_i; \theta), y_i)$
with respect to a given labeled training set {(x1,y1),…,(xA,yA)}.
Learning based on the empirical risk minimization principle is
defined as
$\theta^* = \arg\min_{\theta} R_{\mathrm{emp}}(q(x; \theta))$
Examples of algorithms: Perceptron, Back-propagation, etc.
© Voitech Franc, cmp.felk.cvut.cz/~xfrancv/talks/franc-printro03.ppt
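A minimal empirical-risk-minimization sketch with a 0/1 loss and a one-parameter threshold rule q(x; θ), found by a simple grid search (not from the slides; the rule family, data, and search grid are assumptions, NumPy assumed):

import numpy as np

def emp_risk(theta, X, y):
    """Empirical risk: average 0/1 loss of the threshold rule q(x; theta) = sign(x - theta)."""
    pred = np.where(X >= theta, 1, -1)
    return np.mean(pred != y)

rng = np.random.default_rng(3)
X = rng.normal(size=200)
y = np.where(X + rng.normal(scale=0.3, size=200) >= 0.2, 1, -1)   # noisy labels

thetas = np.linspace(-2, 2, 401)
risks = [emp_risk(t, X, y) for t in thetas]
theta_star = thetas[int(np.argmin(risks))]        # argmin of the empirical risk
print("theta* =", theta_star, "R_emp =", min(risks))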
Overfitting and Underfitting
Problem: how rich a class of classification rules q(x;θ) to use.
[Illustrations: underfitting / good fit / overfitting]
Problem of generalization: a small empirical risk Remp does not
imply a small true expected risk R.
© Voitech Franc, cmp.felk.cvut.cz/~xfrancv/talks/franc-printro03.ppt
Structural Risk Minimization Principle
Statistical learning theory -- Vapnik & Chervonenkis.
An upper bound on the expected risk of a classification rule q∈Q:
$R(q) \le R_{\mathrm{emp}}(q) + R_{\mathrm{str}}\!\left(\frac{1}{A},\, h,\, \log\frac{1}{\sigma}\right)$
where A is the number of training examples, h is the VC-dimension of the class
of functions Q, and 1-σ is the confidence of the upper bound.
SRM principle: from given nested function classes Q1, Q2, …, Qm,
such that
$h_1 \le h_2 \le \dots \le h_m$,
select the rule q* which minimizes the upper bound on the expected risk.
© Voitech Franc, cmp.felk.cvut.cz/~xfrancv/talks/franc-printro03.ppt
Unsupervised Learning
Input: training examples {x1,…,xA} without information about the
hidden state.
Clustering: the goal is to find clusters of data sharing similar properties.
A broad class of unsupervised learning algorithms:
[Diagram: the examples {x1,…,xA} are fed to a classifier q : X × Θ → Y and to a
(supervised) learning algorithm L : (X × Y)^A → Θ; the classifier’s current labels
{y1,…,yA} are fed back to the learning algorithm, which updates the parameters θ.]
© Voitech Franc, cmp.felk.cvut.cz/~xfrancv/talks/franc-printro03.ppt
Example
k-Means Clustering:
Classifier:
$y = q(x) = \arg\min_{i=1,\dots,k} \| x - w_i \|$
Goal: minimize
$\sum_{i=1}^{A} \| x_i - w_{q(x_i)} \|^2$
Learning algorithm:
$w_i = \frac{1}{|I_i|} \sum_{j \in I_i} x_j, \qquad I_i = \{ j : q(x_j) = i \}$
[Diagram: training examples {x1,…,xA} are partitioned around the cluster centers
w1, w2, w3; the parameters are θ = {w1,…,wk} and the cluster assignments are {y1,…,yA}.]
© Voitech Franc, cmp.felk.cvut.cz/~xfrancv/talks/franc-printro03.ppt
Neural Network References 
• Neural Networks, a Comprehensive Foundation, S. Haykin, Prentice Hall
(1999)
• Neural Networks for Pattern Recognition, C. M. Bishop, Clarendon Press,
Oxford (1997)
• Self-Organizing Maps, T. Kohonen, Springer (2001)
Some ANN Toolboxes 
• Free software
- SNNS: Stuttgart Neural Network Simulator & JavaNNS
- GNG at Uni Bochum
• Matlab toolboxes
- Fuzzy Logic
- Artificial Neural Networks
- Signal Processing
Pattern Recognition / 
Vector Quantization References 
Textbooks 
Duda, Hart: Pattern Classification and Scene Analysis. J. Wiley & Sons, New York,
1982 (2nd edition 2000).
Fukunaga: Introduction to Statistical Pattern Recognition. Academic Press, 1990.
Bishop: Neural Networks for Pattern Recognition. Clarendon Press, Oxford, 1997.
Schlesinger, Hlaváč: Ten Lectures on Statistical and Structural Pattern Recognition.
Kluwer Academic Publishers, 2002.

More Related Content

What's hot

ARTIFICIAL NEURAL NETWORKS
ARTIFICIAL NEURAL NETWORKSARTIFICIAL NEURAL NETWORKS
ARTIFICIAL NEURAL NETWORKS
AIMS Education
 
Fundamental, An Introduction to Neural Networks
Fundamental, An Introduction to Neural NetworksFundamental, An Introduction to Neural Networks
Fundamental, An Introduction to Neural Networks
Nelson Piedra
 
Artificial neural network model & hidden layers in multilayer artificial neur...
Artificial neural network model & hidden layers in multilayer artificial neur...Artificial neural network model & hidden layers in multilayer artificial neur...
Artificial neural network model & hidden layers in multilayer artificial neur...
Muhammad Ishaq
 
Introduction Of Artificial neural network
Introduction Of Artificial neural networkIntroduction Of Artificial neural network
Introduction Of Artificial neural network
Nagarajan
 
Introduction to Neural networks (under graduate course) Lecture 9 of 9
Introduction to Neural networks (under graduate course) Lecture 9 of 9Introduction to Neural networks (under graduate course) Lecture 9 of 9
Introduction to Neural networks (under graduate course) Lecture 9 of 9
Randa Elanwar
 
Artificial Neural Network(Artificial intelligence)
Artificial Neural Network(Artificial intelligence)Artificial Neural Network(Artificial intelligence)
Artificial Neural Network(Artificial intelligence)
spartacus131211
 
Feedforward neural network
Feedforward neural networkFeedforward neural network
Feedforward neural network
Sopheaktra YONG
 
Artificial neural network
Artificial neural networkArtificial neural network
Artificial neural network
AkshanshAgarwal4
 
Artificial neural networks and its application
Artificial neural networks and its applicationArtificial neural networks and its application
Artificial neural networks and its application
Hưng Đặng
 
Artificial Neural Network Paper Presentation
Artificial Neural Network Paper PresentationArtificial Neural Network Paper Presentation
Artificial Neural Network Paper Presentation
guestac67362
 
Artificial neural networks seminar presentation using MSWord.
Artificial neural networks seminar presentation using MSWord.Artificial neural networks seminar presentation using MSWord.
Artificial neural networks seminar presentation using MSWord.
Mohd Faiz
 
Artificial Neural Networks - ANN
Artificial Neural Networks - ANNArtificial Neural Networks - ANN
Artificial Neural Networks - ANN
Mohamed Talaat
 
Artificial neural network
Artificial neural networkArtificial neural network
Artificial neural network
Mohd Arafat Shaikh
 
Neural networks introduction
Neural networks introductionNeural networks introduction
Neural networks introduction
آيةالله عبدالحكيم
 
Artificial neural network
Artificial neural networkArtificial neural network
Artificial neural network
DEEPASHRI HK
 
Nural network ER. Abhishek k. upadhyay
Nural network ER. Abhishek  k. upadhyayNural network ER. Abhishek  k. upadhyay
Nural network ER. Abhishek k. upadhyay
abhishek upadhyay
 
neural networks
 neural networks neural networks
Neural Networks
Neural NetworksNeural Networks
Neural Networks
NikitaRuhela
 
Artificial neural network
Artificial neural networkArtificial neural network
Artificial neural network
IshaneeSharma
 
Artificial neural networks
Artificial neural networksArtificial neural networks
Artificial neural networks
madhu sudhakar
 

What's hot (20)

ARTIFICIAL NEURAL NETWORKS
ARTIFICIAL NEURAL NETWORKSARTIFICIAL NEURAL NETWORKS
ARTIFICIAL NEURAL NETWORKS
 
Fundamental, An Introduction to Neural Networks
Fundamental, An Introduction to Neural NetworksFundamental, An Introduction to Neural Networks
Fundamental, An Introduction to Neural Networks
 
Artificial neural network model & hidden layers in multilayer artificial neur...
Artificial neural network model & hidden layers in multilayer artificial neur...Artificial neural network model & hidden layers in multilayer artificial neur...
Artificial neural network model & hidden layers in multilayer artificial neur...
 
Introduction Of Artificial neural network
Introduction Of Artificial neural networkIntroduction Of Artificial neural network
Introduction Of Artificial neural network
 
Introduction to Neural networks (under graduate course) Lecture 9 of 9
Introduction to Neural networks (under graduate course) Lecture 9 of 9Introduction to Neural networks (under graduate course) Lecture 9 of 9
Introduction to Neural networks (under graduate course) Lecture 9 of 9
 
Artificial Neural Network(Artificial intelligence)
Artificial Neural Network(Artificial intelligence)Artificial Neural Network(Artificial intelligence)
Artificial Neural Network(Artificial intelligence)
 
Feedforward neural network
Feedforward neural networkFeedforward neural network
Feedforward neural network
 
Artificial neural network
Artificial neural networkArtificial neural network
Artificial neural network
 
Artificial neural networks and its application
Artificial neural networks and its applicationArtificial neural networks and its application
Artificial neural networks and its application
 
Artificial Neural Network Paper Presentation
Artificial Neural Network Paper PresentationArtificial Neural Network Paper Presentation
Artificial Neural Network Paper Presentation
 
Artificial neural networks seminar presentation using MSWord.
Artificial neural networks seminar presentation using MSWord.Artificial neural networks seminar presentation using MSWord.
Artificial neural networks seminar presentation using MSWord.
 
Artificial Neural Networks - ANN
Artificial Neural Networks - ANNArtificial Neural Networks - ANN
Artificial Neural Networks - ANN
 
Artificial neural network
Artificial neural networkArtificial neural network
Artificial neural network
 
Neural networks introduction
Neural networks introductionNeural networks introduction
Neural networks introduction
 
Artificial neural network
Artificial neural networkArtificial neural network
Artificial neural network
 
Nural network ER. Abhishek k. upadhyay
Nural network ER. Abhishek  k. upadhyayNural network ER. Abhishek  k. upadhyay
Nural network ER. Abhishek k. upadhyay
 
neural networks
 neural networks neural networks
neural networks
 
Neural Networks
Neural NetworksNeural Networks
Neural Networks
 
Artificial neural network
Artificial neural networkArtificial neural network
Artificial neural network
 
Artificial neural networks
Artificial neural networksArtificial neural networks
Artificial neural networks
 

Viewers also liked

Neural network & its applications
Neural network & its applications Neural network & its applications
Neural network & its applications
Ahmed_hashmi
 
Brain, Neurons, and Spinal Cord
Brain, Neurons, and Spinal CordBrain, Neurons, and Spinal Cord
Brain, Neurons, and Spinal Cord
jmorgan80
 
Use of artificial neural network in pattern recognition
Use of artificial neural network in pattern recognitionUse of artificial neural network in pattern recognition
Use of artificial neural network in pattern recognition
kamalsrit
 
Artificial intelligence Pattern recognition system
Artificial intelligence Pattern recognition systemArtificial intelligence Pattern recognition system
Artificial intelligence Pattern recognition system
REHMAT ULLAH
 
Artificial Neural Network Lecture 6- Associative Memories & Discrete Hopfield...
Artificial Neural Network Lecture 6- Associative Memories & Discrete Hopfield...Artificial Neural Network Lecture 6- Associative Memories & Discrete Hopfield...
Artificial Neural Network Lecture 6- Associative Memories & Discrete Hopfield...
Mohammed Bennamoun
 
lecture07.ppt
lecture07.pptlecture07.ppt
lecture07.ppt
butest
 
Introduction to pattern recognition
Introduction to pattern recognitionIntroduction to pattern recognition
Introduction to pattern recognition
Luís Gustavo Martins
 
Artificial intelligence NEURAL NETWORKS
Artificial intelligence NEURAL NETWORKSArtificial intelligence NEURAL NETWORKS
Artificial intelligence NEURAL NETWORKS
REHMAT ULLAH
 
208-EEI-49
208-EEI-49208-EEI-49
208-EEI-49
Promit Mukherjee
 
Aula 02 - Seminário Sobre a Igreja (Segunda Temporada)
Aula 02 - Seminário Sobre a Igreja (Segunda Temporada)Aula 02 - Seminário Sobre a Igreja (Segunda Temporada)
Aula 02 - Seminário Sobre a Igreja (Segunda Temporada)
IBC de Jacarepaguá
 
Fraire systems
Fraire systemsFraire systems
Fraire systems
Fraire Systems
 
Teacher's Kit for Interactive Journalism by Juliana Ruhfus
Teacher's Kit for Interactive Journalism by Juliana RuhfusTeacher's Kit for Interactive Journalism by Juliana Ruhfus
Teacher's Kit for Interactive Journalism by Juliana Ruhfus
julianaruhfus
 
презентация 2
презентация 2 презентация 2
презентация 2
HeratOfUral
 
історія обчилювальної техніки
історія обчилювальної технікиісторія обчилювальної техніки
історія обчилювальної техніки
igoryacok36
 
El logaritmo
El logaritmoEl logaritmo
El logaritmo
GiovAnna94
 
Muka depan content
Muka depan contentMuka depan content
Muka depan contentmohd admee
 
Assistive Technology Presentation
Assistive Technology Presentation Assistive Technology Presentation
Assistive Technology Presentation
awood4127
 
Aula 1 - Seminário Sobre a Igreja...
Aula 1 - Seminário Sobre a Igreja...Aula 1 - Seminário Sobre a Igreja...
Aula 1 - Seminário Sobre a Igreja...
IBC de Jacarepaguá
 
основные масштабные факторы микро , макро-, мегамира
основные масштабные факторы микро , макро-, мегамираосновные масштабные факторы микро , макро-, мегамира
основные масштабные факторы микро , макро-, мегамираAlina Yanchuk
 
XzavianCarter Unit 1 Final Project
XzavianCarter Unit 1 Final ProjectXzavianCarter Unit 1 Final Project
XzavianCarter Unit 1 Final Project
XzavianCarter
 

Viewers also liked (20)

Neural network & its applications
Neural network & its applications Neural network & its applications
Neural network & its applications
 
Brain, Neurons, and Spinal Cord
Brain, Neurons, and Spinal CordBrain, Neurons, and Spinal Cord
Brain, Neurons, and Spinal Cord
 
Use of artificial neural network in pattern recognition
Use of artificial neural network in pattern recognitionUse of artificial neural network in pattern recognition
Use of artificial neural network in pattern recognition
 
Artificial intelligence Pattern recognition system
Artificial intelligence Pattern recognition systemArtificial intelligence Pattern recognition system
Artificial intelligence Pattern recognition system
 
Artificial Neural Network Lecture 6- Associative Memories & Discrete Hopfield...
Artificial Neural Network Lecture 6- Associative Memories & Discrete Hopfield...Artificial Neural Network Lecture 6- Associative Memories & Discrete Hopfield...
Artificial Neural Network Lecture 6- Associative Memories & Discrete Hopfield...
 
lecture07.ppt
lecture07.pptlecture07.ppt
lecture07.ppt
 
Introduction to pattern recognition
Introduction to pattern recognitionIntroduction to pattern recognition
Introduction to pattern recognition
 
Artificial intelligence NEURAL NETWORKS
Artificial intelligence NEURAL NETWORKSArtificial intelligence NEURAL NETWORKS
Artificial intelligence NEURAL NETWORKS
 
208-EEI-49
208-EEI-49208-EEI-49
208-EEI-49
 
Aula 02 - Seminário Sobre a Igreja (Segunda Temporada)
Aula 02 - Seminário Sobre a Igreja (Segunda Temporada)Aula 02 - Seminário Sobre a Igreja (Segunda Temporada)
Aula 02 - Seminário Sobre a Igreja (Segunda Temporada)
 
Fraire systems
Fraire systemsFraire systems
Fraire systems
 
Teacher's Kit for Interactive Journalism by Juliana Ruhfus
Teacher's Kit for Interactive Journalism by Juliana RuhfusTeacher's Kit for Interactive Journalism by Juliana Ruhfus
Teacher's Kit for Interactive Journalism by Juliana Ruhfus
 
презентация 2
презентация 2 презентация 2
презентация 2
 
історія обчилювальної техніки
історія обчилювальної технікиісторія обчилювальної техніки
історія обчилювальної техніки
 
El logaritmo
El logaritmoEl logaritmo
El logaritmo
 
Muka depan content
Muka depan contentMuka depan content
Muka depan content
 
Assistive Technology Presentation
Assistive Technology Presentation Assistive Technology Presentation
Assistive Technology Presentation
 
Aula 1 - Seminário Sobre a Igreja...
Aula 1 - Seminário Sobre a Igreja...Aula 1 - Seminário Sobre a Igreja...
Aula 1 - Seminário Sobre a Igreja...
 
основные масштабные факторы микро , макро-, мегамира
основные масштабные факторы микро , макро-, мегамираосновные масштабные факторы микро , макро-, мегамира
основные масштабные факторы микро , макро-, мегамира
 
XzavianCarter Unit 1 Final Project
XzavianCarter Unit 1 Final ProjectXzavianCarter Unit 1 Final Project
XzavianCarter Unit 1 Final Project
 

Similar to Lecture artificial neural networks and pattern recognition

JAISTサマースクール2016「脳を知るための理論」講義04 Neural Networks and Neuroscience
JAISTサマースクール2016「脳を知るための理論」講義04 Neural Networks and Neuroscience JAISTサマースクール2016「脳を知るための理論」講義04 Neural Networks and Neuroscience
JAISTサマースクール2016「脳を知るための理論」講義04 Neural Networks and Neuroscience
hirokazutanaka
 
02 Fundamental Concepts of ANN
02 Fundamental Concepts of ANN02 Fundamental Concepts of ANN
02 Fundamental Concepts of ANN
Tamer Ahmed Farrag, PhD
 
Artificial Neural networks
Artificial Neural networksArtificial Neural networks
Artificial Neural networks
Learnbay Datascience
 
w1-01-introtonn.ppt
w1-01-introtonn.pptw1-01-introtonn.ppt
w1-01-introtonn.ppt
KotaGuru1
 
Modeling Electronic Health Records with Recurrent Neural Networks
Modeling Electronic Health Records with Recurrent Neural NetworksModeling Electronic Health Records with Recurrent Neural Networks
Modeling Electronic Health Records with Recurrent Neural Networks
Josh Patterson
 
20141003.journal club
20141003.journal club20141003.journal club
20141003.journal club
Hayaru SHOUNO
 
Neural Networks
Neural NetworksNeural Networks
Neural Networks
Ismail El Gayar
 
Perceptron
PerceptronPerceptron
Perceptron
Nagarajan
 
ANN - UNIT 1.pptx
ANN - UNIT 1.pptxANN - UNIT 1.pptx
Lect1_Threshold_Logic_Unit lecture 1 - ANN
Lect1_Threshold_Logic_Unit  lecture 1 - ANNLect1_Threshold_Logic_Unit  lecture 1 - ANN
Lect1_Threshold_Logic_Unit lecture 1 - ANN
MostafaHazemMostafaa
 
7 nn1-intro.ppt
7 nn1-intro.ppt7 nn1-intro.ppt
7 nn1-intro.ppt
Sambit Satpathy
 
Artificial Neural Network and its Applications
Artificial Neural Network and its ApplicationsArtificial Neural Network and its Applications
Artificial Neural Network and its Applications
shritosh kumar
 
Tsinghua invited talk_zhou_xing_v2r0
Tsinghua invited talk_zhou_xing_v2r0Tsinghua invited talk_zhou_xing_v2r0
Tsinghua invited talk_zhou_xing_v2r0
Joe Xing
 
Neural
NeuralNeural
Neural
Vaibhav Shah
 
Ann model and its application
Ann model and its applicationAnn model and its application
Ann model and its application
milan107
 
DATA SCIENCE
DATA SCIENCEDATA SCIENCE
DATA SCIENCE
HarshikaBansal1
 
BACKPROPOGATION ALGO.pdfLECTURE NOTES WITH SOLVED EXAMPLE AND FEED FORWARD NE...
BACKPROPOGATION ALGO.pdfLECTURE NOTES WITH SOLVED EXAMPLE AND FEED FORWARD NE...BACKPROPOGATION ALGO.pdfLECTURE NOTES WITH SOLVED EXAMPLE AND FEED FORWARD NE...
BACKPROPOGATION ALGO.pdfLECTURE NOTES WITH SOLVED EXAMPLE AND FEED FORWARD NE...
DurgadeviParamasivam
 
A04401001013
A04401001013A04401001013
A04401001013
ijceronline
 
Artificial Neural Network
Artificial Neural NetworkArtificial Neural Network
Artificial Neural Network
ssuserab4f3e
 
Dr. kiani artificial neural network lecture 1
Dr. kiani artificial neural network lecture 1Dr. kiani artificial neural network lecture 1
Dr. kiani artificial neural network lecture 1
Parinaz Faraji
 

Similar to Lecture artificial neural networks and pattern recognition (20)

JAISTサマースクール2016「脳を知るための理論」講義04 Neural Networks and Neuroscience
JAISTサマースクール2016「脳を知るための理論」講義04 Neural Networks and Neuroscience JAISTサマースクール2016「脳を知るための理論」講義04 Neural Networks and Neuroscience
JAISTサマースクール2016「脳を知るための理論」講義04 Neural Networks and Neuroscience
 
02 Fundamental Concepts of ANN
02 Fundamental Concepts of ANN02 Fundamental Concepts of ANN
02 Fundamental Concepts of ANN
 
Artificial Neural networks
Artificial Neural networksArtificial Neural networks
Artificial Neural networks
 
w1-01-introtonn.ppt
w1-01-introtonn.pptw1-01-introtonn.ppt
w1-01-introtonn.ppt
 
Modeling Electronic Health Records with Recurrent Neural Networks
Modeling Electronic Health Records with Recurrent Neural NetworksModeling Electronic Health Records with Recurrent Neural Networks
Modeling Electronic Health Records with Recurrent Neural Networks
 
20141003.journal club
20141003.journal club20141003.journal club
20141003.journal club
 
Neural Networks
Neural NetworksNeural Networks
Neural Networks
 
Perceptron
PerceptronPerceptron
Perceptron
 
ANN - UNIT 1.pptx
ANN - UNIT 1.pptxANN - UNIT 1.pptx
ANN - UNIT 1.pptx
 
Lect1_Threshold_Logic_Unit lecture 1 - ANN
Lect1_Threshold_Logic_Unit  lecture 1 - ANNLect1_Threshold_Logic_Unit  lecture 1 - ANN
Lect1_Threshold_Logic_Unit lecture 1 - ANN
 
7 nn1-intro.ppt
7 nn1-intro.ppt7 nn1-intro.ppt
7 nn1-intro.ppt
 
Artificial Neural Network and its Applications
Artificial Neural Network and its ApplicationsArtificial Neural Network and its Applications
Artificial Neural Network and its Applications
 
Tsinghua invited talk_zhou_xing_v2r0
Tsinghua invited talk_zhou_xing_v2r0Tsinghua invited talk_zhou_xing_v2r0
Tsinghua invited talk_zhou_xing_v2r0
 
Neural
NeuralNeural
Neural
 
Ann model and its application
Ann model and its applicationAnn model and its application
Ann model and its application
 
DATA SCIENCE
DATA SCIENCEDATA SCIENCE
DATA SCIENCE
 
BACKPROPOGATION ALGO.pdfLECTURE NOTES WITH SOLVED EXAMPLE AND FEED FORWARD NE...
BACKPROPOGATION ALGO.pdfLECTURE NOTES WITH SOLVED EXAMPLE AND FEED FORWARD NE...BACKPROPOGATION ALGO.pdfLECTURE NOTES WITH SOLVED EXAMPLE AND FEED FORWARD NE...
BACKPROPOGATION ALGO.pdfLECTURE NOTES WITH SOLVED EXAMPLE AND FEED FORWARD NE...
 
A04401001013
A04401001013A04401001013
A04401001013
 
Artificial Neural Network
Artificial Neural NetworkArtificial Neural Network
Artificial Neural Network
 
Dr. kiani artificial neural network lecture 1
Dr. kiani artificial neural network lecture 1Dr. kiani artificial neural network lecture 1
Dr. kiani artificial neural network lecture 1
 

More from Hưng Đặng

Tao cv-tieng-anhby pdfcv
Tao cv-tieng-anhby pdfcvTao cv-tieng-anhby pdfcv
Tao cv-tieng-anhby pdfcv
Hưng Đặng
 
Tai lieutonghop.com --mau-cv-curriculum-vitae-bang-tieng-viet
Tai lieutonghop.com --mau-cv-curriculum-vitae-bang-tieng-vietTai lieutonghop.com --mau-cv-curriculum-vitae-bang-tieng-viet
Tai lieutonghop.com --mau-cv-curriculum-vitae-bang-tieng-viet
Hưng Đặng
 
Dhq-cv-tieng-anh
 Dhq-cv-tieng-anh Dhq-cv-tieng-anh
Dhq-cv-tieng-anh
Hưng Đặng
 
Cv it developer
Cv it developerCv it developer
Cv it developer
Hưng Đặng
 
Cach viết-cv
Cach viết-cvCach viết-cv
Cach viết-cv
Hưng Đặng
 
Cach viet cv hay
Cach viet cv hayCach viet cv hay
Cach viet cv hay
Hưng Đặng
 
Nonlinear image processing using artificial neural
Nonlinear image processing using artificial neuralNonlinear image processing using artificial neural
Nonlinear image processing using artificial neural
Hưng Đặng
 
Mlp trainning algorithm
Mlp trainning algorithmMlp trainning algorithm
Mlp trainning algorithm
Hưng Đặng
 
Image compression and reconstruction using a new approach by artificial neura...
Image compression and reconstruction using a new approach by artificial neura...Image compression and reconstruction using a new approach by artificial neura...
Image compression and reconstruction using a new approach by artificial neura...
Hưng Đặng
 
Artificial neural networks and its application
Artificial neural networks and its applicationArtificial neural networks and its application
Artificial neural networks and its application
Hưng Đặng
 
Lecture artificial neural networks and pattern recognition
Lecture   artificial neural networks and pattern recognitionLecture   artificial neural networks and pattern recognition
Lecture artificial neural networks and pattern recognition
Hưng Đặng
 
Image compression and reconstruction using a new approach by artificial neura...
Image compression and reconstruction using a new approach by artificial neura...Image compression and reconstruction using a new approach by artificial neura...
Image compression and reconstruction using a new approach by artificial neura...
Hưng Đặng
 

More from Hưng Đặng (12)

Tao cv-tieng-anhby pdfcv
Tao cv-tieng-anhby pdfcvTao cv-tieng-anhby pdfcv
Tao cv-tieng-anhby pdfcv
 
Tai lieutonghop.com --mau-cv-curriculum-vitae-bang-tieng-viet
Tai lieutonghop.com --mau-cv-curriculum-vitae-bang-tieng-vietTai lieutonghop.com --mau-cv-curriculum-vitae-bang-tieng-viet
Tai lieutonghop.com --mau-cv-curriculum-vitae-bang-tieng-viet
 
Dhq-cv-tieng-anh
 Dhq-cv-tieng-anh Dhq-cv-tieng-anh
Dhq-cv-tieng-anh
 
Cv it developer
Cv it developerCv it developer
Cv it developer
 
Cach viết-cv
Cach viết-cvCach viết-cv
Cach viết-cv
 
Cach viet cv hay
Cach viet cv hayCach viet cv hay
Cach viet cv hay
 
Nonlinear image processing using artificial neural
Nonlinear image processing using artificial neuralNonlinear image processing using artificial neural
Nonlinear image processing using artificial neural
 
Mlp trainning algorithm
Mlp trainning algorithmMlp trainning algorithm
Mlp trainning algorithm
 
Image compression and reconstruction using a new approach by artificial neura...
Image compression and reconstruction using a new approach by artificial neura...Image compression and reconstruction using a new approach by artificial neura...
Image compression and reconstruction using a new approach by artificial neura...
 
Artificial neural networks and its application
Artificial neural networks and its applicationArtificial neural networks and its application
Artificial neural networks and its application
 
Lecture artificial neural networks and pattern recognition
Lecture   artificial neural networks and pattern recognitionLecture   artificial neural networks and pattern recognition
Lecture artificial neural networks and pattern recognition
 
Image compression and reconstruction using a new approach by artificial neura...
Image compression and reconstruction using a new approach by artificial neura...Image compression and reconstruction using a new approach by artificial neura...
Image compression and reconstruction using a new approach by artificial neura...
 

Recently uploaded

[JPP-1] - (JEE 3.0) - Kinematics 1D - 14th May..pdf
[JPP-1] - (JEE 3.0) - Kinematics 1D - 14th May..pdf[JPP-1] - (JEE 3.0) - Kinematics 1D - 14th May..pdf
[JPP-1] - (JEE 3.0) - Kinematics 1D - 14th May..pdf
awadeshbabu
 
ACEP Magazine edition 4th launched on 05.06.2024
ACEP Magazine edition 4th launched on 05.06.2024ACEP Magazine edition 4th launched on 05.06.2024
ACEP Magazine edition 4th launched on 05.06.2024
Rahul
 
Series of visio cisco devices Cisco_Icons.ppt
Series of visio cisco devices Cisco_Icons.pptSeries of visio cisco devices Cisco_Icons.ppt
Series of visio cisco devices Cisco_Icons.ppt
PauloRodrigues104553
 
Low power architecture of logic gates using adiabatic techniques
Low power architecture of logic gates using adiabatic techniquesLow power architecture of logic gates using adiabatic techniques
Low power architecture of logic gates using adiabatic techniques
nooriasukmaningtyas
 
bank management system in java and mysql report1.pdf
bank management system in java and mysql report1.pdfbank management system in java and mysql report1.pdf
bank management system in java and mysql report1.pdf
Divyam548318
 
Literature Review Basics and Understanding Reference Management.pptx
Literature Review Basics and Understanding Reference Management.pptxLiterature Review Basics and Understanding Reference Management.pptx
Literature Review Basics and Understanding Reference Management.pptx
Dr Ramhari Poudyal
 
Heat Resistant Concrete Presentation ppt
Heat Resistant Concrete Presentation pptHeat Resistant Concrete Presentation ppt
Heat Resistant Concrete Presentation ppt
mamunhossenbd75
 
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
thanhdowork
 
basic-wireline-operations-course-mahmoud-f-radwan.pdf
basic-wireline-operations-course-mahmoud-f-radwan.pdfbasic-wireline-operations-course-mahmoud-f-radwan.pdf
basic-wireline-operations-course-mahmoud-f-radwan.pdf
NidhalKahouli2
 
spirit beverages ppt without graphics.pptx
spirit beverages ppt without graphics.pptxspirit beverages ppt without graphics.pptx
spirit beverages ppt without graphics.pptx
Madan Karki
 
5214-1693458878915-Unit 6 2023 to 2024 academic year assignment (AutoRecovere...
5214-1693458878915-Unit 6 2023 to 2024 academic year assignment (AutoRecovere...5214-1693458878915-Unit 6 2023 to 2024 academic year assignment (AutoRecovere...
5214-1693458878915-Unit 6 2023 to 2024 academic year assignment (AutoRecovere...
ihlasbinance2003
 
Generative AI leverages algorithms to create various forms of content
Generative AI leverages algorithms to create various forms of contentGenerative AI leverages algorithms to create various forms of content
Generative AI leverages algorithms to create various forms of content
Hitesh Mohapatra
 
Presentation of IEEE Slovenia CIS (Computational Intelligence Society) Chapte...
Presentation of IEEE Slovenia CIS (Computational Intelligence Society) Chapte...Presentation of IEEE Slovenia CIS (Computational Intelligence Society) Chapte...
Presentation of IEEE Slovenia CIS (Computational Intelligence Society) Chapte...
University of Maribor
 
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressionsKuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
Victor Morales
 
A review on techniques and modelling methodologies used for checking electrom...
Lecture: Artificial Neural Networks and Pattern Recognition

  • 17. ANN Design 1. Get a large amount of data: inputs and outputs 2. Analyze the data on the PC: - Relevant inputs? - Linear correlations (ANN necessary)? - Transform and scale variables - Other useful preprocessing? - Divide into 3 data sets: training set, test set, validation set © Leonard Studer, humanresources.web.cern.ch/humanresources/external/training/tech/special/DISP2003/DISP-2003_L21A_30Apr03.ppt
  • 18. ANN Design 3. Set the ANN architecture: What type of ANN? - Number of inputs, outputs? - Number of hidden layers - Number of neurons - Learning schema “details” 4. Tune/optimize internal parameters by presenting the training data set to the ANN 5. Validate on the test / validation dataset © Leonard Studer, humanresources.web.cern.ch/humanresources/external/training/tech/special/DISP2003/DISP-2003_L21A_30Apr03.ppt
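To make the data-preparation part of steps 1-5 concrete, here is a minimal Python/NumPy sketch of scaling and a 3-way split; the array names, data shapes, and split ratios are illustrative assumptions, not values from the slides.

```python
import numpy as np

# Hypothetical data: rows are samples, X holds inputs, y holds target outputs.
X = np.random.rand(1000, 8)
y = np.random.rand(1000, 1)

# Scale inputs to zero mean / unit variance (one common preprocessing choice).
X = (X - X.mean(axis=0)) / X.std(axis=0)

# Shuffle, then divide into training / test / validation sets (60/20/20 split).
idx = np.random.permutation(len(X))
n_train, n_test = int(0.6 * len(X)), int(0.2 * len(X))
train_idx = idx[:n_train]
test_idx = idx[n_train:n_train + n_test]
val_idx = idx[n_train + n_test:]

X_train, y_train = X[train_idx], y[train_idx]
X_test,  y_test  = X[test_idx],  y[test_idx]
X_val,   y_val   = X[val_idx],   y[val_idx]
```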
  • 19. Main Types of ANN Supervised Learning: ƒ Feed-forward ANN - Multi-Layer Perceptron (with sigmoid hidden neurons) ƒ Recurrent Networks - Neurons are connected to self and others - Time delay of signal transfer - Multidirectional information flow Unsupervised Learning: ƒ Self-organizing ANN - Kohonen Maps - Vector Quantization - Neural Gas
  • 20. Feed-Forward ANN • Information flow is unidirectional • Data is presented to Input layer • Passed on to Hidden Layer • Passed on to Output layer • Information is distributed • Information processing is parallel Internal representation (interpretation) of data © torsten.reil@zoo.ox.ac.uk, users.ox.ac.uk/~quee0818/teaching/Neural_Networks.ppt
  • 21. Supervised Learning Training set: $\{(x_{\mathrm{in}}^{\mu}, t_{\mathrm{out}}^{\mu});\ 1 \le \mu \le P\}$. For the input $x_{\mathrm{in}}^{\mu}$ the network produces the output $x_{\mathrm{out}}^{\mu}$, while $t_{\mathrm{out}}^{\mu}$ is the desired output (supervisor). The error $= x_{\mathrm{out}}^{\mu} - t_{\mathrm{out}}^{\mu}$ drives the weight adjustment, typically by backpropagation of errors. © Leonard Studer, humanresources.web.cern.ch/humanresources/external/training/tech/special/DISP2003/DISP-2003_L21A_30Apr03.ppt
  • 22. Important Properties of FFN • Assume ƒ g(x): a bounded and sufficiently regular function ƒ an FFN with 1 hidden layer of finitely many (N) neurons (the transfer function is identical for every neuron) • ⇒ the FFN is a universal approximator of g(x), in the sense of uniform approximation, to arbitrary precision ε (theorem by Cybenko et al., 1989). © Leonard Studer, humanresources.web.cern.ch/humanresources/external/training/tech/special/DISP2003/DISP-2003_L21A_30Apr03.ppt
  • 23. Important Properties of FFN • Assume ƒ an FFN as before (1 hidden layer of finitely many N neurons, nonlinear transfer function) ƒ approximation precision ε • ⇒ #{wi} ~ # inputs (theorem by Barron, 1993): the ANN is more parsimonious in #{wi} than a linear approximator [linear approximator: #{wi} ~ exp(# inputs)]. © Leonard Studer, humanresources.web.cern.ch/humanresources/external/training/tech/special/DISP2003/DISP-2003_L21A_30Apr03.ppt
  • 24. Roughness of Output • The output depends on the whole set of weighted links {wij} • Example: output unit versus input 1 and input 2 for a 2*10*1 ANN with random weights © Leonard Studer, humanresources.web.cern.ch/humanresources/external/training/tech/special/DISP2003/DISP-2003_L21A_30Apr03.ppt
  • 25. Feeding Data Through the FFN Weighted sum into the unit: (1 × 0.25) + (0.5 × (−1.5)) = 0.25 + (−0.75) = −0.5. Squashing (sigmoid): $\frac{1}{1 + e^{0.5}} = 0.3775$ © torsten.reil@zoo.ox.ac.uk, users.ox.ac.uk/~quee0818/teaching/Neural_Networks.ppt
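A minimal Python sketch of this single forward step; the two inputs, the weights, and the sigmoid squashing are taken from the worked numbers above.

```python
import math

def sigmoid(h):
    """Logistic squashing function 1 / (1 + e^(-h))."""
    return 1.0 / (1.0 + math.exp(-h))

inputs  = [1.0, 0.5]
weights = [0.25, -1.5]

# Weighted sum: (1 x 0.25) + (0.5 x -1.5) = -0.5
h = sum(x * w for x, w in zip(inputs, weights))

print(h)           # -0.5
print(sigmoid(h))  # ~0.3775
```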
  • 26. Feeding Data Through the FFN • Data is presented to the network in the form of activations in the input layer • Examples ƒ Pixel intensity (for pictures) ƒ Molecule concentrations (for artificial nose) ƒ Share prices (for stock market prediction) • Data usually requires preprocessing ƒ Analogous to senses in biology • How to represent more abstract data, e.g. a name? ƒ Choose a pattern, e.g. - 0-0-1 for “Chris” - 0-1-0 for “Becky” © torsten.reil@zoo.ox.ac.uk, users.ox.ac.uk/~quee0818/teaching/Neural_Networks.ppt
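As a small illustration of the fixed-pattern idea, a one-of-N binary encoding in Python; the vocabulary (including the name "Anna") and the helper below are hypothetical, not from the slides.

```python
names = ["Anna", "Becky", "Chris"]   # hypothetical vocabulary

def encode(name):
    """Return a one-of-N binary pattern, e.g. [0, 0, 1] for the third name."""
    return [1 if i == names.index(name) else 0 for i in range(len(names))]

print(encode("Becky"))   # [0, 1, 0]
print(encode("Chris"))   # [0, 0, 1]
```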
  • 27. Training the Network How do we adjust the weights? • Backpropagation ƒ Requires training set (input / output pairs) ƒ Starts with small random weights ƒ Error is used to adjust weights (supervised learning) → Gradient descent on error landscape © torsten.reil@zoo.ox.ac.uk, users.ox.ac.uk/~quee0818/teaching/Neural_Networks.ppt
  • 28. Backpropagation © torsten.reil@zoo.ox.ac.uk, users.ox.ac.uk/~quee0818/teaching/Neural_Networks.ppt
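A minimal sketch of backpropagation for a tiny fully connected network with sigmoid units, trained by gradient descent on the squared error. The toy XOR-like data, network size, learning rate, and epoch count are illustrative assumptions, not values from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(h):
    return 1.0 / (1.0 + np.exp(-h))

# Toy training set (XOR-like), chosen only for illustration.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
t = np.array([[0.], [1.], [1.], [0.]])

# Small random initial weights (input->hidden and hidden->output), plus biases.
W1, b1 = 0.5 * rng.standard_normal((2, 4)), np.zeros(4)
W2, b2 = 0.5 * rng.standard_normal((4, 1)), np.zeros(1)
lr = 0.5

for epoch in range(10000):
    # Forward pass
    hidden = sigmoid(X @ W1 + b1)
    out    = sigmoid(hidden @ W2 + b2)

    # Backward pass: propagate the output error back through the layers.
    err       = out - t                                   # dE/d(out) for squared error
    delta_out = err * out * (1 - out)                     # sigmoid derivative at output
    delta_hid = (delta_out @ W2.T) * hidden * (1 - hidden)

    # Gradient-descent weight updates
    W2 -= lr * hidden.T @ delta_out
    b2 -= lr * delta_out.sum(axis=0)
    W1 -= lr * X.T @ delta_hid
    b1 -= lr * delta_hid.sum(axis=0)

print(out.round(2))   # after training, outputs move toward [[0], [1], [1], [0]]
```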
  • 29. Backpropagation • Advantages ƒ It works! ƒ Relatively fast • Downsides ƒ Requires a training set ƒ Can be slow to converge ƒ Probably not biologically realistic • Alternatives to Backpropagation ƒ Hebbian learning - Not successful in feed-forward nets ƒ Reinforcement learning - Only limited success in FFN ƒ Artificial evolution - More general, but can be even slower than backprop © torsten.reil@zoo.ox.ac.uk, users.ox.ac.uk/~quee0818/teaching/Neural_Networks.ppt
  • 30. Applications of FFN ƒ Pattern recognition - Character recognition - Face recognition ƒ Sonar mine/rock recognition (Gorman & Sejnowski, 1988) ƒ Navigation of a car (Pomerleau, 1989) ƒ Stock-market prediction ƒ Pronunciation (NETtalk) (Sejnowski & Rosenberg, 1987) © torsten.reil@zoo.ox.ac.uk, users.ox.ac.uk/~quee0818/teaching/Neural_Networks.ppt
  • 31. Protein Secondary Structure Prediction (Holley-Karplus, Ph.D., etc): Supervised learning: ƒ Adjust weight vectors so output of network matches desired result α-helical coil amino acid sequence
  • 32. Recurrent Networks • Feed forward networks: ƒ Information only flows one way ƒ One input pattern produces one output ƒ No sense of time (or memory of previous state) • Recurrency ƒ Nodes connect back to other nodes or themselves ƒ Information flow is multidirectional ƒ Sense of time and memory of previous state(s) • Biological nervous systems show high levels of recurrency (but feed-forward structures exist too) © torsten.reil@zoo.ox.ac.uk, users.ox.ac.uk/~quee0818/teaching/Neural_Networks.ppt
  • 33. Elman Nets • Elman nets are feed forward networks with partial recurrency • Unlike feed forward nets, Elman nets have a memory or sense of time © torsten.reil@zoo.ox.ac.uk, users.ox.ac.uk/~quee0818/teaching/Neural_Networks.ppt
  • 34. Elman Nets Classic experiment on language acquisition and processing (Elman, 1990) • Task ƒ Elman net to predict successive words in sentences. • Data ƒ Suite of sentences, e.g. - “The boy catches the ball.” - “The girl eats an apple.” ƒ Words are input one at a time • Representation ƒ Binary representation for each word, e.g. - 0-1-0-0-0 for “girl” • Training method ƒ Backpropagation © torsten.reil@zoo.ox.ac.uk, users.ox.ac.uk/~quee0818/teaching/Neural_Networks.ppt
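A minimal sketch of the partial recurrency in an Elman net: hidden activations are copied into a context layer and fed back as extra inputs at the next time step. The layer sizes, random weights, and the 5-bit word code below are illustrative assumptions; only the binary code for "girl" follows the slide.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(h):
    return 1.0 / (1.0 + np.exp(-h))

n_in, n_hidden, n_out = 5, 8, 5          # one-of-5 word codes, as on the slide
W_in  = 0.1 * rng.standard_normal((n_in, n_hidden))
W_ctx = 0.1 * rng.standard_normal((n_hidden, n_hidden))   # context -> hidden
W_out = 0.1 * rng.standard_normal((n_hidden, n_out))

context = np.zeros(n_hidden)             # "memory" of the previous hidden state

def step(word_vec):
    """One Elman time step: hidden sees the current input plus the previous context."""
    global context
    hidden  = sigmoid(word_vec @ W_in + context @ W_ctx)
    output  = sigmoid(hidden @ W_out)    # prediction of the next word's code
    context = hidden.copy()              # copy hidden state into the context layer
    return output

girl = np.array([0., 1., 0., 0., 0.])    # binary code for "girl", as on the slide
print(step(girl).round(2))
```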
  • 35. Elman Nets Internal representation of words © torsten.reil@zoo.ox.ac.uk, users.ox.ac.uk/~quee0818/teaching/Neural_Networks.ppt
  • 36. Hopfield Networks • Sub-type of recurrent neural nets ƒ Fully recurrent ƒ Weights are symmetric ƒ Nodes can only be on or off ƒ Random updating • Learning: Hebb rule (cells that fire together wire together) • Can recall a memory, if presented with a corrupt or incomplete version → auto-associative or content-addressable memory © torsten.reil@zoo.ox.ac.uk, users.ox.ac.uk/~quee0818/teaching/Neural_Networks.ppt
  • 37. Hopfield Networks Task: store images with resolution of 20x20 pixels → Hopfield net with 400 nodes Memorise: 1. Present image 2. Apply Hebb rule (cells that fire together, wire together) - Increase weight between two nodes if both have same activity, otherwise decrease 3. Go to 1 Recall: 1. Present incomplete pattern 2. Pick random node, update 3. Go to 2 until settled © torsten.reil@zoo.ox.ac.uk, users.ox.ac.uk/~quee0818/teaching/Neural_Networks.ppt
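A minimal sketch of this memorise/recall procedure for small binary (+1/−1) patterns, using the Hebb rule for storage and asynchronous random updates for recall; the pattern size, number of stored patterns, and iteration count are illustrative choices, not the 20x20 setup of the slide.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 25                                   # e.g. a 5x5 binary image, flattened

# Memorise: Hebb rule -- increase w_ij when nodes i and j share the same activity.
patterns = [rng.choice([-1, 1], size=n) for _ in range(2)]
W = np.zeros((n, n))
for p in patterns:
    W += np.outer(p, p)
np.fill_diagonal(W, 0)                   # no self-connections

# Recall: present a corrupted pattern, then repeatedly update random nodes.
state = patterns[0].copy()
state[:5] *= -1                          # corrupt part of the stored pattern
for _ in range(500):
    i = rng.integers(n)                  # pick a random node
    state[i] = 1 if W[i] @ state >= 0 else -1

print(np.array_equal(state, patterns[0]))   # usually True: the memory is recovered
```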
  • 38. Hopfield Networks © torsten.reil@zoo.ox.ac.uk, users.ox.ac.uk/~quee0818/teaching/Neural_Networks.ppt
  • 39. Hopfield Networks • Memories are attractors in state space © torsten.reil@zoo.ox.ac.uk, users.ox.ac.uk/~quee0818/teaching/Neural_Networks.ppt
  • 40. Catastrophic Forgetting • Problem: memorising new patterns corrupts the memory of older ones → Old memories cannot be recalled, or spurious memories arise • Solution: allow Hopfield net to sleep © torsten.reil@zoo.ox.ac.uk, users.ox.ac.uk/~quee0818/teaching/Neural_Networks.ppt
  • 41. Solutions ƒ Unlearning (Hopfield, 1986) - Recall old memories by random stimulation, but use an inverse Hebb rule → ‘Makes room’ for new memories (basins of attraction shrink) ƒ Pseudorehearsal (Robins, 1995) - While learning new memories, recall old memories by random stimulation - Use standard Hebb rule on new and old memories → Restructure memory • Needs short-term + long-term memory - Mammals: hippocampus plays back new memories to neo-cortex, which is randomly stimulated at the same time © torsten.reil@zoo.ox.ac.uk, users.ox.ac.uk/~quee0818/teaching/Neural_Networks.ppt
  • 43. Unsupervised (Self-Organized) Learning [Figure comparing three architectures: feed-forward (supervised); feed-forward + lateral feedback (recurrent network, still supervised); self-organizing network (unsupervised), with the input layer covering a continuous input space and the output layer forming a discrete output space]
  • 44. Self Organizing Map (SOM) [Figure: a neural lattice mapped onto the input signal space] (Kohonen, 1984)
  • 45. Illustration of Kohonen Learning Inputs: coordinates (x, y) of points drawn from a square. Display neuron j at the position (xj, yj) where its response sj is maximum. [Figure: random initial positions, then the map after 100, 200 and 1000 inputs] © Leonard Studer, humanresources.web.cern.ch/humanresources/external/training/tech/special/DISP2003/DISP-2003_L21A_30Apr03.ppt
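A minimal sketch of Kohonen learning on exactly this kind of data (points drawn uniformly from a square, a small neuron lattice, a shrinking neighbourhood). The lattice size and the learning-rate and radius schedules are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

# A 10x10 lattice of neurons, each with a 2-D weight vector (its position in input space).
grid = np.array([(i, j) for i in range(10) for j in range(10)], dtype=float)
W = rng.random((100, 2))                  # random initial positions in the unit square

n_steps = 1000
for t in range(n_steps):
    x = rng.random(2)                     # input: a point drawn from the square

    # Best-matching unit: the neuron whose weight vector is closest to the input.
    bmu = np.argmin(((W - x) ** 2).sum(axis=1))

    # Learning rate and neighbourhood radius both shrink over time.
    eta   = 0.5 * (1 - t / n_steps) + 0.01
    sigma = 3.0 * (1 - t / n_steps) + 0.5

    # Neighbourhood function on the lattice: nearby neurons are pulled along too.
    d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
    h = np.exp(-d2 / (2 * sigma ** 2))

    W += eta * h[:, None] * (x - W)

print(W.min(axis=0), W.max(axis=0))       # weights spread out over the unit square
```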
  • 46. Why use Kohonen Maps? • Image Analysis - Image Classification • Data Visualization - By projection from high D -> 2D Preserving neighborhood relationships • Partitioning Input Space Vector Quantization (Coding) © Leonard Studer, humanresources.web.cern.ch/humanresources/external/training/ tech/special/DISP2003/DISP-2003_L21A_30Apr03.ppt
  • 47–52. Example: Modeling of the somatosensory map of the hand (Ritter, Martinetz & Schulten, 1992). [Sequence of figure slides]
  • 53. Representing Topology with the Kohonen SOM • free neurons from lattice… • stimulus–dependent connectivities
  • 54. The “Neural Gas” Algorithm (Martinetz & Schulten, 1992) connectivity matrix: $C_{ij} \in \{0, 1\}$; age matrix: $T_{ij} \in \{0, \dots, T\}$ [Figure: stimulus]
  • 57. More Examples: Torus and Myosin S1
  • 58. Growing Neural Gas GNG = Neural gas & dynamical creation/removal of links © http://www.neuroinformatik.ruhr-uni-bochum.de
  • 59. Why use GNG ? • Adaptability to Data Topology ƒ Both dynamically and spatially • Data Analysis • Data Visualization © Leonard Studer, humanresources.web.cern.ch/humanresources/external/training/ tech/special/DISP2003/DISP-2003_L21A_30Apr03.ppt
  • 60. Radial Basis Function Networks Outputs are a linear combination of a hidden layer of RBF neurons; inputs fan in. Usually an unsupervised learning procedure is applied: set the number of neurons and then adjust 1. the Gaussian centers, 2. the Gaussian widths, 3. the weights. © Leonard Studer, humanresources.web.cern.ch/humanresources/external/training/tech/special/DISP2003/DISP-2003_L21A_30Apr03.ppt
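A minimal sketch of such a network: centers taken from the data (a simple stand-in for an unsupervised step such as k-means), one shared Gaussian width, and output weights fitted by linear least squares. The sizes and the toy target function are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy 1-D regression data (illustrative target function).
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0])

# 1. Gaussian centers: here simply a random subset of the inputs.
centers = X[rng.choice(len(X), size=10, replace=False)]
# 2. Gaussian widths: one shared width for all RBF neurons.
sigma = 1.0

def rbf_layer(X):
    """Activations of the hidden RBF neurons for each input row."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2 * sigma ** 2))

# 3. Output weights: linear combination fitted by least squares.
H = rbf_layer(X)
w, *_ = np.linalg.lstsq(H, y, rcond=None)

y_hat = rbf_layer(X) @ w
print(np.mean((y_hat - y) ** 2))          # small mean-squared error on the toy data
```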
  • 61. Why use RBF ? • Density estimation • Discrimination • Regression • Good to know: ƒ Can be described as Bayesian Networks ƒ Close to some Fuzzy Systems © Leonard Studer, humanresources.web.cern.ch/humanresources/external/training/ tech/special/DISP2003/DISP-2003_L21A_30Apr03.ppt
  • 62. Demo Internet Java demo http://www.neuroinformatik.ruhr-uni-bochum.de/ini/VDM/research/gsn/DemoGNG/GNG.html • HebbRule • LBG / k-means • Neural Gas • GNG • Kohonen SOM
  • 64. Vector Quantization Lloyd (1957); Linde, Buzo & Gray (1980): digital signal processing, speech and image compression. Martinetz & Schulten (1993): neural gas. Encode data (in $\mathbb{R}^D$) using a finite set $\{w_j\}$ (j = 1,…,k) of codebook vectors. The Voronoi tessellation divides $\mathbb{R}^D$ into k Voronoi polyhedra (“receptive fields”): $V_i = \{\, v \in \mathbb{R}^D : \|v - w_i\| \le \|v - w_j\|\ \forall j \,\}$
  • 66. k-Means, a.k.a. Linde, Buzo & Gray (LBG) Encoding distortion error: $E = \sum_i \| v_i - w_{j(i)} \|^2$ (sum over data points $v_i$, with $w_{j(i)}$ the closest codebook vector). Lower $E(\{w_j(t)\})$ iteratively by gradient descent: $\forall r:\ \Delta w_r \equiv w_r(t) - w_r(t-1) = -\varepsilon\, \frac{\partial E}{\partial w_r} = 2\varepsilon \sum_i \delta_{r, j(i)}\, (v_i - w_r)$. Online (Monte Carlo) approach for a sequence $v_i(t)$ selected at random according to the probability density function: $\Delta w_r(t) \sim \varepsilon \cdot \delta_{r, j(i)} \cdot (v_i(t) - w_r)$. Advantage: fast, reasonable clustering. Limitations: depends on the initial random positions; difficult to avoid getting trapped in the many local minima of E.
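A minimal sketch of the online (Monte Carlo) update above: draw a data point at random, find its closest codebook vector, and move only that vector a small step toward the point. The data distribution, number of codebook vectors, and step size are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)

data = rng.random((2000, 2))              # data points v_i in the unit square
k = 8
W = data[rng.choice(len(data), size=k, replace=False)].copy()   # initial codebook

eps = 0.05                                # learning rate epsilon
for _ in range(10000):
    v = data[rng.integers(len(data))]     # random data point v_i(t)
    j = np.argmin(((W - v) ** 2).sum(axis=1))   # closest ("winning") codebook vector
    W[j] += eps * (v - W[j])              # only the winner moves; all other w_r unchanged

# Encoding distortion error E = sum_i ||v_i - w_j(i)||^2
d2 = ((data[:, None, :] - W[None, :, :]) ** 2).sum(axis=2)
print(d2.min(axis=1).sum())
```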
  • 67. Neural Gas Revisited Avoid the local-minima traps of k-means by smoothing the energy function: $\forall r:\ \Delta w_r \sim \varepsilon \cdot e^{-s_r(v_i(t), \{w_j\})/\lambda} \cdot (v_i(t) - w_r)$, where $s_r$ is the closeness rank: ordering the codebook vectors so that $\|v_i - w_{j_0}\| \le \|v_i - w_{j_1}\| \le \dots \le \|v_i - w_{j_{k-1}}\|$ gives $s_{j_0} = 0,\ s_{j_1} = 1,\ \dots,\ s_{j_{k-1}} = k-1$.
  • 68. Neural Gas Revisited For λ → 0 this reduces to k-means; for λ ≠ 0 not only the “winner” but also the second-, third-, … closest codebook vectors are updated. One can show that this corresponds to stochastic gradient descent on the smoothed energy $\tilde{E}_\lambda(\{w_j\}) = \sum_{i}^{d} \sum_{j}^{k} e^{-s_j(v_i)/\lambda}\, \| v_i - w_j \|^2$. As λ → 0, $\tilde{E} \to E$ (k-means); as λ → ∞, $\tilde{E}$ becomes parabolic (a single minimum) ⇒ anneal λ(t) from large to small.
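A minimal sketch of the rank-based neural gas update (the softened version of the k-means step above), with annealed λ(t) and step size; the schedules and problem size are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(6)

data = rng.random((2000, 2))
k = 8
W = data[rng.choice(len(data), size=k, replace=False)].copy()

n_steps = 10000
for t in range(n_steps):
    v = data[rng.integers(len(data))]

    # Closeness ranks s_r: 0 for the closest codebook vector, 1 for the next, ...
    d2 = ((W - v) ** 2).sum(axis=1)
    ranks = np.argsort(np.argsort(d2))

    # Anneal both the neighbourhood range lambda and the step size epsilon.
    frac = t / n_steps
    lam  = 5.0 * (0.01 / 5.0) ** frac          # decays from 5.0 to 0.01
    eps  = 0.5 * (0.005 / 0.5) ** frac         # decays from 0.5 to 0.005

    # All codebook vectors move, weighted by exp(-rank / lambda).
    h = np.exp(-ranks / lam)
    W += eps * h[:, None] * (v - W)

d2 = ((data[:, None, :] - W[None, :, :]) ** 2).sum(axis=2)
print(d2.min(axis=1).sum())                    # encoding distortion error E
```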
  • 69. Neural Gas Revisited Q: How do we know that we have found the global minimum of E? A: We don’t (in general). But we can compute the statistical variability of the $\{w_j\}$ by repeating the calculation with different seeds for the random number generator. Codebook vector variability arises due to: • statistical uncertainty, • spread of local minima. A small variability indicates good convergence behavior. Optimum choice of the number of vectors k: where the variability is minimal.
  • 71. Pattern Recognition Definition: “The assignment of a physical object or event to one of several prespecified categories” -- Duda & Hart • A pattern is an object, process or event that can be given a name. • A pattern class (or category) is a set of patterns sharing common attributes and usually originating from the same source. • During recognition (or classification) given objects are assigned to prescribed classes. • A classifier is a machine which performs classification. © Voitech Franc, cmp.felk.cvut.cz/~xfrancv/talks/franc-printro03.ppt
  • 72. PR Applications • Optical Character Recognition (OCR) • Biometrics • Diagnostic systems • Handwritten: sorting letters by postal code, input device for PDAs. • Printed texts: reading machines for blind people, digitalization of text documents. • Face recognition, verification, retrieval. • Fingerprint recognition. • Speech recognition. • Medical diagnosis: X-Ray, EKG analysis. • Machine diagnostics, waster detection. © Voitech Franc, cmp.felk.cvut.cz/~xfrancv/talks/franc-printro03.ppt
  • 73. Approaches • Statistical PR: based on underlying statistical model of patterns and pattern classes. • Structural (or syntactic) PR: pattern classes represented by means of formal structures as grammars, automata, strings, etc. • Neural networks: classifier is represented as a network of cells modeling neurons of the human brain (connectionist approach). © Voitech Franc, cmp.felk.cvut.cz/~xfrancv/talks/franc-printro03.ppt
  • 74. Basic Concepts Feature vector $x = [x_1, x_2, \dots, x_n]^T$: a vector of observations (measurements); $x \in X$ is a point in the feature space $X$. Hidden state $y \in Y$: cannot be directly measured; patterns with equal hidden state belong to the same class. Task: to design a classifier (decision rule) $q : X \to Y$ which decides about a hidden state based on an observation (the pattern). © Voitech Franc, cmp.felk.cvut.cz/~xfrancv/talks/franc-printro03.ppt
  • 75. Example Task: jockey-hoopster recognition. The feature vector is $x = [x_1, x_2]^T$ with $x_1$ = height and $x_2$ = weight. The set of hidden states is $Y = \{H, J\}$, the feature space is $X = \mathbb{R}^2$. Training examples: $\{(x_1, y_1), \dots, (x_l, y_l)\}$. Linear classifier: $q(x) = H$ if $(w \cdot x) + b \ge 0$, and $q(x) = J$ if $(w \cdot x) + b < 0$; the decision boundary is $(w \cdot x) + b = 0$. © Voitech Franc, cmp.felk.cvut.cz/~xfrancv/talks/franc-printro03.ppt
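A minimal sketch of this decision rule; the weight vector, bias, and the height/weight example points are made-up numbers for illustration only.

```python
import numpy as np

w = np.array([0.04, 0.03])      # hypothetical weights for [height_cm, weight_kg]
b = -10.0                       # hypothetical bias

def q(x):
    """Linear classifier: 'H' (hoopster) if (w.x) + b >= 0, else 'J' (jockey)."""
    return "H" if np.dot(w, x) + b >= 0 else "J"

print(q(np.array([200.0, 95.0])))   # tall and heavy  -> 'H'
print(q(np.array([160.0, 52.0])))   # short and light -> 'J'
```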
  • 76. Components of a PR System Sensors and preprocessing Feature extraction Classifier Class assignment Teacher Learning algorithm Pattern • Sensors and preprocessing. • A feature extraction aims to create discriminative features good for classification. • A classifier. • A teacher provides information about hidden state -- supervised learning. • A learning algorithm sets PR from training examples. © Voitech Franc, cmp.felk.cvut.cz/~xfrancv/talks/franc-printro03.ppt
  • 77. Feature Extraction Task: to extract features which are good for classification. Good features: • Objects from the same class have similar feature values. • Objects from different classes have different values. “Good” features “Bad” features © Voitech Franc, cmp.felk.cvut.cz/~xfrancv/talks/franc-printro03.ppt
  • 78. Feature Extraction Methods Feature extraction: map the full measurement vector $x = [x_1, \dots, x_n]^T$ through functions $\phi_1, \dots, \phi_k$ to a feature vector $m = [m_1, \dots, m_k]^T$. Feature selection: keep only a subset of the original measurements. The problem can be expressed as optimization of the parameters θ of a feature extractor φ(θ). Supervised methods: the objective function is a criterion of separability (discriminability) of labeled examples, e.g., linear discriminant analysis (LDA). Unsupervised methods: a lower-dimensional representation which preserves important characteristics of the input data is sought, e.g., principal component analysis (PCA). © Voitech Franc, cmp.felk.cvut.cz/~xfrancv/talks/franc-printro03.ppt
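As one concrete unsupervised example, a minimal PCA sketch: project the data onto the leading eigenvectors of its covariance matrix. The data dimensions and the number of retained components are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(7)

# Toy data: 200 samples of n = 5 correlated measurements.
X = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 5))

# Center the data and compute its covariance matrix.
Xc = X - X.mean(axis=0)
cov = np.cov(Xc, rowvar=False)

# Eigen-decomposition; sort eigenvectors by decreasing eigenvalue.
eigval, eigvec = np.linalg.eigh(cov)
order = np.argsort(eigval)[::-1]
eigvec = eigvec[:, order]

# Feature extraction: keep the k = 2 leading principal components.
k = 2
M = Xc @ eigvec[:, :k]
print(M.shape)    # (200, 2): lower-dimensional features preserving most variance
```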
  • 79. Classifier A classifier partitions the feature space X into class-labeled regions such that $X = X_1 \cup X_2 \cup \dots \cup X_{|Y|}$ and $X_1 \cap X_2 \cap \dots \cap X_{|Y|} = \emptyset$. The classification consists of determining to which region a feature vector x belongs. Borders between decision regions are called decision boundaries. © Voitech Franc, cmp.felk.cvut.cz/~xfrancv/talks/franc-printro03.ppt
  • 80. Representation of a Classifier A classifier is typically represented as a set of discriminant functions $f_i(x) : X \to \mathbb{R},\ i = 1, \dots, |Y|$. The classifier assigns a feature vector x to the i-th class if $f_i(x) > f_j(x)\ \forall j \ne i$: the feature vector is fed to all discriminant functions and the class with the maximum response is chosen as the class identifier. © Voitech Franc, cmp.felk.cvut.cz/~xfrancv/talks/franc-printro03.ppt
  • 81. Bayesian Decision Making • Bayesian decision making is a fundamental statistical approach which allows one to design the optimal classifier if the complete statistical model is known. Definition: observations $x \in X$, hidden states $y \in Y$, decisions $d \in D$, a loss function $W : Y \times D \to \mathbb{R}$, a decision rule $q : X \to D$, and a joint probability $p(x, y)$. Task: to design the decision rule q which minimizes the Bayesian risk $R(q) = \sum_{y \in Y} \sum_{x \in X} p(x, y)\, W(q(x), y)$. © Voitech Franc, cmp.felk.cvut.cz/~xfrancv/talks/franc-printro03.ppt
  • 82. Example of a Bayesian Task Task: minimization of the classification error. The set of decisions D is the same as the set of hidden states Y. The 0/1-loss function is used: $W(q(x), y) = 0$ if $q(x) = y$, and $1$ if $q(x) \ne y$. The Bayesian risk R(q) then corresponds to the probability of misclassification. The solution of the Bayesian task is $q^* = \arg\min_q R(q)\ \Rightarrow\ q^*(x) = \arg\max_y p(y \mid x) = \arg\max_y \frac{p(x \mid y)\, p(y)}{p(x)}$. © Voitech Franc, cmp.felk.cvut.cz/~xfrancv/talks/franc-printro03.ppt
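A minimal sketch of this rule for discrete observations: tabulate (or estimate) p(x|y) and p(y), then classify by the argmax of p(x|y)·p(y), since the denominator p(x) does not change the argmax. The two classes and the toy probability tables are illustrative assumptions.

```python
# Hypothetical discrete model: the observation x is one of three symbols 'a', 'b', 'c'.
p_y = {"class1": 0.6, "class2": 0.4}                      # prior p(y)
p_x_given_y = {
    "class1": {"a": 0.7, "b": 0.2, "c": 0.1},             # p(x | y = class1)
    "class2": {"a": 0.1, "b": 0.3, "c": 0.6},             # p(x | y = class2)
}

def q_star(x):
    """Bayes rule under 0/1 loss: pick the y maximizing p(x|y) * p(y)."""
    return max(p_y, key=lambda y: p_x_given_y[y][x] * p_y[y])

print(q_star("a"))   # class1: 0.7*0.6 = 0.42 beats 0.1*0.4 = 0.04
print(q_star("c"))   # class2: 0.6*0.4 = 0.24 beats 0.1*0.6 = 0.06
```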
  • 83. Limitations of the Bayesian Approach • The statistical model p(x,y) is mostly not known, therefore learning must be employed to estimate p(x,y) from training examples {(x1,y1),…,(xA,yA)} -- plug-in Bayes. • Non-Bayesian methods offer further task formulations: • Only a partial statistical model is available: • p(y) is not known or does not exist. • p(x|y,θ) is influenced by a non-random intervention θ. • The loss function is not defined. • Examples: Neyman-Pearson's task, Minimax task, etc. © Voitech Franc, cmp.felk.cvut.cz/~xfrancv/talks/franc-printro03.ppt
  • 84. Discriminative Approaches Given a class of classification rules q(x;θ) parametrized by θ∈Ξ, the task is to find the “best” parameter θ* based on a set of training examples {(x1,y1),…,(xA,yA)} -- supervised learning. The task of learning: to select which classification rule is to be used. How the learning is performed is determined by the selected inductive principle. © Voitech Franc, cmp.felk.cvut.cz/~xfrancv/talks/franc-printro03.ppt
  • 85. Empirical Risk Minimization Principle The true expected risk R(q) is approximated by the empirical risk $R_{\mathrm{emp}}(q(\cdot;\theta)) = \frac{1}{A} \sum_{i=1}^{A} W(q(x_i;\theta), y_i)$ with respect to a given labeled training set {(x1,y1),…,(xA,yA)}. Learning based on the empirical risk minimization principle is defined as $\theta^* = \arg\min_{\theta} R_{\mathrm{emp}}(q(\cdot;\theta))$. Examples of algorithms: Perceptron, Back-propagation, etc. © Voitech Franc, cmp.felk.cvut.cz/~xfrancv/talks/franc-printro03.ppt
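A minimal sketch of the perceptron as one instance of this principle: the parameters θ = (w, b) of a linear rule are adjusted whenever a training example is misclassified, which drives the empirical (0/1) risk down on linearly separable data. The toy data set is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(8)

# Toy linearly separable data: label +1 above the line x1 + x2 = 1, else -1
# (points too close to the line are dropped so a separating margin exists).
X = rng.uniform(0, 1, size=(300, 2))
X = X[np.abs(X.sum(axis=1) - 1.0) > 0.1]
y = np.where(X.sum(axis=1) > 1.0, 1, -1)

w, b = np.zeros(2), 0.0
for epoch in range(100):
    errors = 0
    for xi, yi in zip(X, y):
        if yi * (np.dot(w, xi) + b) <= 0:      # misclassified training example
            w += yi * xi                        # perceptron update of theta = (w, b)
            b += yi
            errors += 1
    if errors == 0:                             # empirical risk has reached zero
        break

R_emp = np.mean(np.sign(X @ w + b) != y)        # empirical 0/1 risk on the training set
print(R_emp)                                    # 0.0 once the perceptron has converged
```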
  • 86. Overfitting and Underfitting Problem: how rich a class of classifications q(x;θ) to use. [Figure: underfitting, good fit, overfitting] Problem of generalization: a small empirical risk Remp does not imply a small true expected risk R. © Voitech Franc, cmp.felk.cvut.cz/~xfrancv/talks/franc-printro03.ppt
  • 87. Structural Risk Minimization Principle Statistical learning theory -- Vapnik, Chervonenkis. An upper bound on the expected risk of a classification rule q∈Q: $R(q) \le R_{\mathrm{emp}}(q) + R_{\mathrm{str}}\!\left(\tfrac{1}{A},\, h,\, \log\tfrac{1}{\sigma}\right)$, where A is the number of training examples, h is the VC-dimension of the class of functions Q, and 1−σ is the confidence of the upper bound. SRM principle: from given nested function classes Q1, Q2, …, Qm, such that $h_1 \le h_2 \le \dots \le h_m$, select the rule q* which minimizes the upper bound on the expected risk. © Voitech Franc, cmp.felk.cvut.cz/~xfrancv/talks/franc-printro03.ppt
  • 88. Unsupervised Learning Input: training examples {x1,…,xA} without information about the hidden state. Clustering: the goal is to find clusters of data sharing similar properties. [Figure: a broad class of unsupervised learning algorithms combining a classifier $q : X \times \Theta \to Y$ with a (supervised) learning algorithm $L : (X \times Y)^A \to \Theta$, operating on the examples {x1,…,xA} and the current labels {y1,…,yA}] © Voitech Franc, cmp.felk.cvut.cz/~xfrancv/talks/franc-printro03.ppt
  • 89. Example: k-Means Clustering Classifier: $q(x) = \arg\min_{i=1,\dots,k} \| x - w_i \|$. The goal is to minimize $\sum_{i=1}^{A} \| x_i - w_{q(x_i)} \|^2$. Learning algorithm: $w_i = \frac{1}{|I_i|} \sum_{j \in I_i} x_j$, where $I_i = \{\, j : q(x_j) = i \,\}$. [Figure: data {x1,…,xA} partitioned among codebook vectors w1, w2, w3, with θ = {w1,…,wk} and labels {y1,…,yA}] © Voitech Franc, cmp.felk.cvut.cz/~xfrancv/talks/franc-printro03.ppt
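A minimal sketch of this classifier/learning-algorithm loop (batch k-means in the style of Lloyd): assign every point to its nearest w_i, then move each w_i to the mean of its assigned points, and repeat. The toy data and the number of clusters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(9)

# Toy data: three well-separated blobs in the plane.
X = np.vstack([rng.normal(c, 0.1, size=(50, 2)) for c in [(0, 0), (1, 0), (0, 1)]])
k = 3
W = X[rng.choice(len(X), size=k, replace=False)].copy()

for _ in range(20):
    # Classifier step: q(x) = argmin_i ||x - w_i||
    d2 = ((X[:, None, :] - W[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)

    # Learning step: w_i = mean of the points currently assigned to cluster i.
    new_W = np.array([X[labels == i].mean(axis=0) if np.any(labels == i) else W[i]
                      for i in range(k)])
    if np.allclose(new_W, W):      # converged: the means no longer change
        break
    W = new_W

print(W.round(2))                  # close to the three blob centers
```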
  • 90. Neural Network References • Neural Networks: A Comprehensive Foundation, S. Haykin, Prentice Hall (1999) • Neural Networks for Pattern Recognition, C. M. Bishop, Clarendon Press, Oxford (1997) • Self-Organizing Maps, T. Kohonen, Springer (2001)
  • 91. Some ANN Toolboxes • Free software ƒ SNNS: Stuttgart Neural Network Simulator (and Java NNS) ƒ GNG at Uni Bochum • Matlab toolboxes ƒ Fuzzy Logic ƒ Artificial Neural Networks ƒ Signal Processing
  • 92. Pattern Recognition / Vector Quantization References Textbooks: Duda, Hart: Pattern Classification and Scene Analysis. J. Wiley & Sons, New York, 1982 (2nd edition 2000). Fukunaga: Introduction to Statistical Pattern Recognition. Academic Press, 1990. Bishop: Neural Networks for Pattern Recognition. Clarendon Press, Oxford, 1997. Schlesinger, Hlaváč: Ten Lectures on Statistical and Structural Pattern Recognition. Kluwer Academic Publishers, 2002.