2. • Associative memory is defined as the ability to learn and remember the
relationship between unrelated items, for example, remembering the name of
someone or the aroma of a particular perfume.
• Associative memory deals specifically with the relationship between different
objects or concepts. A typical associative memory task tests participants
on their recall of pairs of unrelated items, such as face-name pairs.
• Associative memories are neural networks (NNs) for modeling the learning
and retrieval of memories in the brain. The retrieved memory and its query are
typically represented by binary, bipolar, or real vectors describing patterns of
neural activity.
Associative memory
3. • Learning is the process of forming associations between related patterns.
• The patterns we associate together may be of the same type or of different
types.
• Each association is an input-output vector pair, s:t.
• If each vector t is the same as the vector s with which it is associated, then the
net is called an autoassociative memory.
• If the t's are different from the s's, the net is called a heteroassociative
memory.
• In each of these cases, the net not only learns the specific pattern pairs that
were used for training, but also is able to recall the desired response pattern
when given an input stimulus that is similar, but not identical, to the training
input.
Pattern Association
4. • Similar to Hebbian learning for classification.
• Algorithm (bipolar or binary patterns):
• For each training sample s:t: $\Delta w_{ij} = s_i \cdot t_j$
• $w_{ij}$ increases if both $s_i$ and $t_j$ are ON (binary) or have the same sign (bipolar).
• Instead of obtaining W by iterative updates, it can be computed from the
training set by summing the outer products of s and t over all P training pairs:
$$w_{ij} = \sum_{p=1}^{P} s_i(p)\, t_j(p), \qquad W = \{w_{ij}\}$$
Training Algorithms for Pattern Association
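5. • Outer product. Let s and t be row vectors.
Then for a particular training pair s:t
$$\Delta W(p) = s^T(p)\, t(p) = \begin{bmatrix} s_1 \\ \vdots \\ s_n \end{bmatrix} (t_1, \ldots, t_m) = \begin{bmatrix} s_1 t_1 & \cdots & s_1 t_m \\ s_2 t_1 & \cdots & s_2 t_m \\ \vdots & & \vdots \\ s_n t_1 & \cdots & s_n t_m \end{bmatrix} = \begin{bmatrix} \Delta w_{11} & \cdots & \Delta w_{1m} \\ \vdots & & \vdots \\ \Delta w_{n1} & \cdots & \Delta w_{nm} \end{bmatrix}$$
And
$$W = \sum_{p=1}^{P} s^T(p)\, t(p)$$
• It involves 3 nested loops p, i, j (order of p is irrelevant)
p = 1 to P /* for every training pair */
i = 1 to n /* for every row in W */
j = 1 to m /* for every element j in row i */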
Outer product
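The three nested loops can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function name `hebb_weights` and the two sample pairs are assumptions made for the example, and patterns are stored as plain lists.

```python
def hebb_weights(pairs):
    """Sum the outer products s^T(p) t(p) over all training pairs (Hebb rule)."""
    n = len(pairs[0][0])  # length of each input vector s
    m = len(pairs[0][1])  # length of each target vector t
    W = [[0] * m for _ in range(n)]
    for s, t in pairs:          # p = 1 to P: every training pair
        for i in range(n):      # every row of W
            for j in range(m):  # every element j in row i
                W[i][j] += s[i] * t[j]
    return W


# Two hypothetical bipolar pairs with n = 3 and m = 2 (not from the slides).
example_pairs = [([1, -1, 1], [1, -1]), ([-1, -1, 1], [-1, 1])]
W = hebb_weights(example_pairs)
```

Because the result is a plain sum, presenting the pairs in a different order gives the same W, which is why the order of p is irrelevant.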
6. • In its original form, the delta rule assumed that the activation function for the
output unit was the identity function.
• A simple extension allows for the use of any differentiable activation
function; we shall call this the extended delta rule.
Delta rule
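As a rough sketch of the difference between the two rules, the update below uses the common textbook form Δw_ij = α (t_j − y_j) f'(y_in_j) x_i, which reduces to the original delta rule when f is the identity (f' = 1). The slide does not state this formula, so the formula, the learning rate `alpha`, and all function names here are assumptions for illustration.

```python
import math


def extended_delta_step(W, x, t, f, f_prime, alpha=0.1):
    """One in-place update: w_ij += alpha * (t_j - y_j) * f'(y_in_j) * x_i."""
    n, m = len(x), len(t)
    for j in range(m):
        y_in = sum(x[i] * W[i][j] for i in range(n))  # net input to output unit j
        y = f(y_in)                                   # activation of output unit j
        delta = alpha * (t[j] - y) * f_prime(y_in)
        for i in range(n):
            W[i][j] += delta * x[i]
    return W


def identity(v):
    return v


def identity_prime(v):
    return 1.0


def tanh_prime(v):
    return 1.0 - math.tanh(v) ** 2


# Original delta rule (identity activation) on one hypothetical pair.
W_id = [[0.0], [0.0]]
for _ in range(50):
    extended_delta_step(W_id, [1, -1], [1], identity, identity_prime)

# Extended delta rule with a differentiable activation (tanh), one step.
W_tanh = extended_delta_step([[0.0], [0.0]], [1, -1], [1], math.tanh, tanh_prime)
```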
7. Hetero-associative Memory
Associative memory neural networks are nets in which the weights
are determined in such a way that the net can store a set of P
pattern associations.
• Each association is a pair of vectors (s(p), t(p)), with p = 1, 2, ..., P.
• Each vector s(p) is an n-tuple (has n components), and each t(p)
is an m-tuple.
• The weights may be found using the Hebb rule or the delta rule.
9. • Binary pattern pairs s:t with |s| = 4 and |t| = 2.
• Total weighted input to output units: $y\_in_j = \sum_i x_i w_{ij}$
• Activation function: threshold
$$y_j = \begin{cases} 1 & \text{if } y\_in_j > 0 \\ 0 & \text{if } y\_in_j \le 0 \end{cases}$$
• Weights are computed by Hebbian rule (sum of outer
products of all training pairs): $W = \sum_{p=1}^{P} s^T(p)\, t(p)$
• Training samples:
s(p) t(p)
p=1 (1 0 0 0) (1, 0)
p=2 (1 1 0 0) (1, 0)
p=3 (0 0 0 1) (0, 1)
p=4 (0 0 1 1) (0, 1)
Example of hetero-associative memory
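11. Recall:
• x = (1 0 0 0):
$$(1\ 0\ 0\ 0) \begin{bmatrix} 2 & 0 \\ 1 & 0 \\ 0 & 1 \\ 0 & 2 \end{bmatrix} = (2\ 0), \qquad y_1 = 1,\ y_2 = 0$$
• x = (0 1 0 0) (similar to s(1) and s(2)):
$$(0\ 1\ 0\ 0) \begin{bmatrix} 2 & 0 \\ 1 & 0 \\ 0 & 1 \\ 0 & 2 \end{bmatrix} = (1\ 0), \qquad y_1 = 1,\ y_2 = 0$$
• x = (0 1 1 0):
$$(0\ 1\ 1\ 0) \begin{bmatrix} 2 & 0 \\ 1 & 0 \\ 0 & 1 \\ 0 & 2 \end{bmatrix} = (1\ 1), \qquad y_1 = 1,\ y_2 = 1$$
• (1 0 0 0), (1 1 0 0): class (1, 0)
• (0 0 0 1), (0 0 1 1): class (0, 1)
• (0 1 1 0) is not sufficiently similar to any class.
• The delta rule would give the same or similar results.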
Example of hetero-associative memory
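The hetero-associative example can be reproduced with a short Python sketch. The training pairs, the Hebbian sum of outer products, and the threshold at zero come from the slides; the function names `hebb_weights` and `recall` are assumptions made for this illustration.

```python
def hebb_weights(pairs):
    """Hebb rule: sum of outer products s^T(p) t(p) over all training pairs."""
    n, m = len(pairs[0][0]), len(pairs[0][1])
    W = [[0] * m for _ in range(n)]
    for s, t in pairs:
        for i in range(n):
            for j in range(m):
                W[i][j] += s[i] * t[j]
    return W


def recall(W, x):
    """Compute y_in_j = sum_i x_i w_ij, then threshold: 1 if y_in_j > 0 else 0."""
    n, m = len(W), len(W[0])
    y_in = [sum(x[i] * W[i][j] for i in range(n)) for j in range(m)]
    return [1 if v > 0 else 0 for v in y_in]


# Training samples s(p), t(p) from the slide.
pairs = [
    ([1, 0, 0, 0], [1, 0]),
    ([1, 1, 0, 0], [1, 0]),
    ([0, 0, 0, 1], [0, 1]),
    ([0, 0, 1, 1], [0, 1]),
]
W = hebb_weights(pairs)

for x in ([1, 0, 0, 0], [0, 1, 0, 0], [0, 1, 1, 0]):
    print(x, "->", recall(W, x))
```

The input (0 1 1 0) activates both output units, which is how the net signals that the input does not clearly belong to either class.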
12. • For an auto-associative net, the training input and target output vectors are
identical.
• The process of training is often called storing the vectors, which may be
binary or bipolar.
• The performance of the net is judged by its ability to reproduce a stored
pattern from noisy input; performance is, in general, better for bipolar vectors
than for binary vectors.
Auto-associative memory
13. Auto-associative memory
• Same as hetero-associative nets, except t(p) = s(p).
• Used to recall a pattern from its noisy or incomplete version
(pattern completion/pattern recovery).
• A single pattern s = (1, 1, 1, -1) is stored (weights computed by the Hebbian
rule, i.e. the outer product):
$$W = \begin{bmatrix} 1 & 1 & 1 & -1 \\ 1 & 1 & 1 & -1 \\ 1 & 1 & 1 & -1 \\ -1 & -1 & -1 & 1 \end{bmatrix}$$
• Recall:
training pattern: (1, 1, 1, -1) · W = (4, 4, 4, -4) → (1, 1, 1, -1)
noisy pattern: (-1, 1, 1, -1) · W = (2, 2, 2, -2) → (1, 1, 1, -1)
missing info: (0, 0, 1, -1) · W = (2, 2, 2, -2) → (1, 1, 1, -1)
more noisy: (-1, -1, 1, -1) · W = (0, 0, 0, 0), not recognized
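A minimal Python sketch of this single-pattern auto-associative net follows. The stored pattern and the outer-product rule come from the slide. The helper names are assumptions, and so are the exact positions of the flipped components in the two noisy inputs; any one flipped component gives the same net input as the first noisy case, and any two give the all-zero result.

```python
def outer(s):
    """Hebbian weights for one stored pattern: W = s^T s."""
    return [[a * b for b in s] for a in s]


def sign(v):
    """Return +1, -1, or 0 (0 means the unit gives no decision)."""
    return (v > 0) - (v < 0)


def recall_once(W, x):
    """Return the net input x.W and the thresholded response."""
    net = [sum(x[i] * W[i][j] for i in range(len(x))) for j in range(len(W[0]))]
    return net, [sign(v) for v in net]


stored = [1, 1, 1, -1]
W = outer(stored)

cases = {
    "training pattern": [1, 1, 1, -1],
    "noisy pattern": [-1, 1, 1, -1],   # one mistake (assumed position)
    "missing info": [0, 0, 1, -1],     # two missing components
    "more noisy": [-1, -1, 1, -1],     # two mistakes (assumed positions)
}
for label, x in cases.items():
    net, y = recall_once(W, x)
    print(label, x, "->", net, "->", y)
```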
14. Auto-associative memory
• The preceding process of using the net can be written more succinctly
as: (1, 1, 1, -1) · W = (4, 4, 4, -4) → (1, 1, 1, -1)
• As before, the differences take one of two forms: "mistakes" in the data
or "missing" data.
• The only "mistakes" we consider are changes from +1 to -1 or vice
versa.
• We use the term "missing" data to refer to a component that has the
value 0, rather than either +1 or -1.
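15. • In some cases the net does not respond immediately to an input signal with a
stored target pattern, but the response may be close enough to a stored pattern
to be worth feeding back into the net as a new input.
• Testing a recurrent auto-associative net: the stored vector with its second, third and
fourth components set to zero.
• The weight matrix to store the vector (1, 1, 1, -1), with zero diagonal (no self-connections), is
$$W = \begin{bmatrix} 0 & 1 & 1 & -1 \\ 1 & 0 & 1 & -1 \\ 1 & 1 & 0 & -1 \\ -1 & -1 & -1 & 0 \end{bmatrix}$$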
Iterative Auto-associative memory
16. • The vector (1,0,0,0) is an example of a vector formed from the stored
vector with three "missing" components (three zero entries).
• The performance of the net for this vector is given next.
• Input vector (1, 0, 0, 0):
• (1, 0, 0, 0) · W = (0, 1, 1, -1) → iterate
• (0, 1, 1, -1) · W = (3, 2, 2, -2) → (1, 1, 1, -1).
• Thus, for the input vector (1, 0, 0, 0), the net produces the "known" vector
(1, 1, 1, -1) as its response in two iterations.
Iterative Auto-associative memory
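The two iterations above can be traced with a small Python sketch. It assumes a zero-diagonal Hebbian weight matrix, which is what the products on this slide imply, and it stops as soon as the response equals a stored vector. The names `store_zero_diag` and `iterate_recall` and the `max_iters` limit are assumptions for illustration.

```python
def store_zero_diag(s):
    """Outer product s^T s with the diagonal set to zero (no self-connections)."""
    n = len(s)
    return [[0 if i == j else s[i] * s[j] for j in range(n)] for i in range(n)]


def sign(v):
    """Return +1, -1, or 0 (0 keeps the component 'missing')."""
    return (v > 0) - (v < 0)


def iterate_recall(W, x, stored_patterns, max_iters=10):
    """Feed each response back as the next input until a stored vector appears."""
    history = []
    for _ in range(max_iters):
        net = [sum(x[i] * W[i][j] for i in range(len(x))) for j in range(len(W[0]))]
        x = [sign(v) for v in net]
        history.append(net)
        if x in stored_patterns:
            break
    return x, history


stored = [1, 1, 1, -1]
W = store_zero_diag(stored)
response, history = iterate_recall(W, [1, 0, 0, 0], [stored])
print(history, "->", response)
```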
17. • First proposed by Bart Kosko
• Heteroassociative network
• It associates patterns from one set, set A, to patterns from another set,
set B, and vice versa
• Generalizes and also produces correct outputs despite corrupted or
incomplete inputs
• Consists of two fully interconnected layers of processing elements
• There can also be a feedback link connecting each
node to itself.
Bidirectional Associative Memory (BAM)
18. • The BAM maps an n-dimensional input vector $X_n$ into an m-dimensional
output vector $Y_m$.
Bidirectional Associative Memory (BAM)
A BAM network (Each node may also be connected to itself)
19. • How does the BAM work?
• The input vector X(p) is applied to the transpose of the weight matrix, $W^T$, to
produce an output vector Y(p).
• Then, the output vector Y(p) is applied to the weight matrix W to produce a new input vector X(p + 1).
• This process is repeated until the input and output vectors no longer change (the network reaches a stable state).
BAM operation: (a) forward direction; (b) backward direction
Bidirectional Associative Memory (BAM)
20. • Store pattern pairs so that when n-dimensional vector X from set A
is presented as input, the BAM recalls m-dimensional vector Y
from set B, but when Y is presented as input, the BAM recalls X.
Basic idea behind the BAM
Bidirectional Associative Memory (BAM)
21. Step 1: Storage. The BAM is required to store M pairs of patterns. For example, we
may wish to store four pairs:
Bidirectional Associative Memory (BAM)
The BAM training algorithm
In this case, the BAM input layer must have six neurons and the output layer three neurons.
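The storage step can be sketched in Python, assuming the usual correlation rule in which the weight matrix is the sum of outer products of the stored pairs, W = Σ X_m Y_m^T. The slide text does not reproduce the four pairs or the rule, so the bipolar pairs below are hypothetical, chosen only to match the stated layer sizes (six input neurons, three output neurons). The name `bam_weights` is likewise an assumption.

```python
def bam_weights(pairs):
    """Correlation storage: W[i][j] = sum over pairs of x_i * y_j (an n x m matrix)."""
    n, m = len(pairs[0][0]), len(pairs[0][1])
    W = [[0] * m for _ in range(n)]
    for x, y in pairs:
        for i in range(n):
            for j in range(m):
                W[i][j] += x[i] * y[j]
    return W


# Hypothetical bipolar pairs (X from set A, Y from set B), for illustration only.
pairs = [
    ([1, 1, 1, 1, 1, 1], [1, 1, 1]),
    ([-1, -1, -1, -1, -1, -1], [-1, -1, -1]),
    ([1, 1, -1, -1, 1, 1], [1, -1, 1]),
    ([-1, -1, 1, 1, -1, -1], [-1, 1, -1]),
]
W = bam_weights(pairs)
for row in W:
    print(row)
```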
23. Step 2: Testing The BAM should be able to receive any vector from set A
and retrieve the associated vector from set B, and receive any vector from set
B and retrieve the associated vector from set A. Thus, first we need to
confirm that the BAM is able to recall $Y_m$ when presented with $X_m$. That is, $Y_m = \mathrm{sign}(W^T X_m)$, $m = 1, 2, \ldots, M$.
For instance
Bidirectional Associative Memory (BAM)
24. Then, we confirm that the BAM recalls $X_m$ when presented with $Y_m$. That is, $X_m = \mathrm{sign}(W\, Y_m)$, $m = 1, 2, \ldots, M$.
For instance
Bidirectional Associative Memory (BAM)
25. • Step 3: Retrieval. Present an unknown vector (probe) X to the BAM and
retrieve a stored association. The probe may present a corrupted or incomplete
version of a pattern from set A (or from set B) stored in the BAM. That is,
set X(0) = X, compute $Y(p) = \mathrm{sign}(W^T X(p))$, and then update $X(p+1) = \mathrm{sign}(W\, Y(p))$.
• Repeat the iteration until equilibrium, when input and output vectors remain unchanged
with further iterations. The input and output patterns will then represent an associated
pair.
Bidirectional Associative Memory (BAM)
26. The BAM is unconditionally stable (Kosko, 1992). This means that any set of
associations can be learned without risk of instability. This important quality
arises from the BAM using the transpose relationship between weight matrices in
forward and backward directions.
Let us now return to our example. Suppose we use vector X as a probe. It
represents a single error compared with the pattern $X_1$ from set A:
This probe applied as the BAM input produces the output vector $Y_1$ from set B.
The vector $Y_1$ is then used as input to retrieve the vector $X_1$ from set A. Thus,
the BAM is indeed capable of error correction.
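The error-correction behaviour can be illustrated with a self-contained Python sketch. It rests on several assumptions that the slide text does not spell out: the stored pairs and the probe are hypothetical, storage uses the sum of outer products, and the activation is a sign function that keeps a unit's previous value when its net input is exactly zero. All function names are illustrative.

```python
def bam_weights(pairs):
    """Correlation storage: W[i][j] = sum over pairs of x_i * y_j."""
    n, m = len(pairs[0][0]), len(pairs[0][1])
    W = [[0] * m for _ in range(n)]
    for x, y in pairs:
        for i in range(n):
            for j in range(m):
                W[i][j] += x[i] * y[j]
    return W


def threshold(net, previous):
    """Bipolar sign; a zero net input leaves the unit at its previous value."""
    return [1 if v > 0 else -1 if v < 0 else p for v, p in zip(net, previous)]


def bam_recall(W, probe, max_iters=20):
    """Alternate forward (W^T x) and backward (W y) passes until nothing changes."""
    n, m = len(W), len(W[0])
    x = list(probe)
    y = [1] * m  # placeholder, only used if a net input is exactly zero
    for _ in range(max_iters):
        new_y = threshold([sum(W[i][j] * x[i] for i in range(n)) for j in range(m)], y)
        new_x = threshold([sum(W[i][j] * new_y[j] for j in range(m)) for i in range(n)], x)
        if new_x == x and new_y == y:
            break  # equilibrium: input and output vectors are unchanged
        x, y = new_x, new_y
    return x, y


# Hypothetical stored pairs (six input neurons, three output neurons).
pairs = [
    ([1, 1, 1, 1, 1, 1], [1, 1, 1]),
    ([-1, -1, -1, -1, -1, -1], [-1, -1, -1]),
    ([1, 1, -1, -1, 1, 1], [1, -1, 1]),
    ([-1, -1, 1, 1, -1, -1], [-1, 1, -1]),
]
W = bam_weights(pairs)

probe = [-1, 1, 1, 1, 1, 1]  # first stored X with its first component flipped
x_out, y_out = bam_recall(W, probe)
print(x_out, y_out)
```

With these pairs, the probe settles on the first stored pair, so the single flipped component is corrected on the backward pass.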