
Clara_de_Paiva_Master_thesis_presentation_June2016


  1. Hierarchical Associative Memories and Sparse Code. Clara de Paiva, Técnico Lisboa – Alameda campus.
  2. Section 1: Associative memories in neural networks.
  3. Associative memories: an example association between an input vector x = (0, 1, 1, 0, 0) and an output vector y = (0, 0, 0, 0, 1, 0).
  4. Associative memories: the weight w_ji is the strength of the link from component i of the input vector (x_i) to component j of the output vector (y_j).
  5. Associative memories: a stored pattern pair, input x^μ = (0, 1, 1, 0, 0) and output y^μ = (0, 0, 0, 0, 1, 0), linked by weights w_ji from input component i to output component j.
  6. The correlation matrix W of the Lernmatrix, with entries w_ji indexed by the input components i (dendrites) and the output components j (neurons).
  7. Hebb's learning rule for correlations: storing the first pair (x¹, y¹) writes 1s into W at the coincidences of fire-shots, i.e. where an input and an output component fire together.
  8. After storing a second pair (x², y²), the correlation counts accumulate in W; with the unclipped rule an entry reaches 2 where both stored pairs correlate the same input and output components.
  9. Behavior of matricial associative memories: the memory load increases with the number of patterns learned. [Plot: memory load p₁ as a function of the number p of patterns learned (stored), over roughly 0–1800 patterns and 0.0%–0.7% load.] The probability that a synapse is active after storing p patterns is p₁ = 1 − (1 − KL/(MN))^p, which for equal input and output dimensions (N) and activities (K) becomes p₁ = 1 − (1 − K²/N²)^p. (A small evaluation sketch follows below.)
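     A minimal evaluation sketch of this memory-load formula in Julia (the language later used for the thesis experiments); the function name and the example values N = 1000, K = 8, p = 1600 are illustrative assumptions, not taken from the thesis:

        # Probability p1 that a synapse is active after storing p random pairs,
        # for equal input/output dimension N and activity K per pattern:
        # p1 = 1 - (1 - K^2/N^2)^p
        memory_load(p, N, K) = 1 - (1 - K^2 / N^2)^p

        memory_load(1600, 1000, 8)    # ≈ 0.097, i.e. roughly 10% of synapses active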
  10. The quality of the retrieved outputs deteriorates with increasing memory load. [Plot: number of add-errors in the output as a function of the memory load in layer r = 3.]
  11. Research question: how can we increase network performance, i.e., increase the number of stored patterns for the same number of computations without compromising the quality of the retrieval? Equivalently: how can we reduce the number of computational steps for the same number of stored patterns?
  12. The same research question, with the proposed solution: reorganizing the matrices.
  13. Section 2: the Lernmatrix, the “learning matrix” neural network of Karl Steinbuch (1958) and Willshaw et al. (1969).
  14. Outline of section 2: a) learning phase, b) retrieval phase, c) optimal capacity.
  15. a) Learning phase, initialization: all weights start at zero (W is the all-zeros matrix).
  16. a) Learning with weights: Hebb's learning rule. [Figure: the matrix W after storing the example pairs (x¹, y¹) and (x², y²).]
  17. a) Learning with weights, Hebb's learning rule as a mathematical update of the weights: w_ij^new = w_ij^old + y_i·x_j. Normalizing: clipped, w_ij^new = min(1, w_ij^old + y_i·x_j), or, equivalently for binary patterns, OR-based, w_ij^new = w_ij^old ∨ (y_i ∧ x_j). (A runnable sketch follows below.)
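     A minimal runnable sketch of the clipped (OR-based) storage rule above, in Julia; the function name and the Int encoding of the binary patterns are illustrative assumptions, not the thesis implementation:

        # Clipped Hebbian storage of one pair (x, y) in a binary Lernmatrix W (m×n):
        # w_ij <- min(1, w_ij + y_i*x_j), i.e. w_ij OR (y_i AND x_j).
        function learn!(W::Matrix{Int}, x::Vector{Int}, y::Vector{Int})
            for i in eachindex(y), j in eachindex(x)
                W[i, j] = min(1, W[i, j] + y[i] * x[j])
            end
            return W
        end

        W = zeros(Int, 6, 5)                               # initialization: all weights zero
        learn!(W, [0, 1, 1, 0, 0], [0, 0, 0, 0, 1, 0])     # store the example pair from slide 3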
  18. b) Retrieval: a thresholded decision. Each component of the output ỹ is a threshold function of the current weighted input x̃: ỹ_i = 1 if Σ_{j=1..n} w_ij·x̃_j ≥ θ_i, and 0 otherwise; the sum is the dendritic potential and θ_i is the threshold.
  19. b) Retrieval, neuron view: the input components x̃_j are multiplied by the weights w_ij and summed into the dendritic potential of neuron i, which outputs y_i = 1 when the potential reaches the threshold θ_i. (A retrieval sketch follows below.)
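     A corresponding retrieval sketch under the same assumed encoding; the threshold vector is supplied by the caller (for instance, the number of 1s in the cue):

        # Thresholded retrieval: y_i = 1 iff the dendritic potential Σ_j w_ij*x_j >= θ_i.
        function retrieve(W::Matrix{Int}, x::Vector{Int}, theta::Vector{Int})
            d = W * x                       # dendritic potentials, one per output neuron
            return Int.(d .>= theta)        # thresholded decision
        end

        retrieve(W, [0, 1, 1, 0, 0], fill(2, 6))    # recovers (0, 0, 0, 0, 1, 0)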
  20. c) Optimal capacity. The maximum number C_stor of associations that can be stored is C_stor = ln(2)·N²/K², with P the number of associations, M the number of neurons (dimension of the output), N the dimension of the input, and K the activity level (number of 1s per vector). The optimum is reached for an O(ln N) activity level, K = log₂(n/4) (sparse code). (A worked numeric instance follows below.)
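     A worked instance of the capacity formula under assumed sizes (these values are not from the thesis): with input dimension N = 1024, the sparse-code activity level is K = log₂(1024/4) = 8, giving on the order of 1.1 × 10⁴ storable associations.

        # Worked capacity example with assumed sizes:
        N = 1024
        K = Int(log2(N / 4))                # sparse code: K = log2(N/4) = 8
        C_stor = log(2) * N^2 / K^2         # ≈ 11356 stored associations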
  21. Section 3: Hierarchical associative memory.
  22. Outline of section 3: a) structure, b) training, c) retrieval.
  23. a) Structure. The state of the network is a set of R correlation matrices W = {W^1, W^2, …, W^r, …, W^{R−1}, W^R}, with r = 1, …, R. The matrices are hierarchically sized by content: dim(W^r) = m × n_r, with n₁ < n₂ < … < n_{R−1} < n_R and m the size of the input.
  24. b) Learning, top layer r = R = 2. [Figure: matrices W^{r=1} and W^{r=2} for the input x, with outputs y^{r=2} of size n₂ = 16 and y^{r=1} of size n₂/a₁ = 16/4 = 4.] The top layer corresponds exactly to the Lernmatrix, and the same clipped Hebb learning rule is applied: w_ij^new = min(1, w_ij^old + y_i·x_j).
  25. b) Learning, layer r = 1 (OR-based aggregation). [Figure: the same example, a hierarchical memory with R = 2 layers, aggregation factor a₁ = 4 and threshold θ_i = 2.] Successively, from r = R − 1 down to r = 1, learning is based on compressed versions of y: y^r_j = ζ(y^{r+1}) = ⋁_{j' = j·a_r − (a_r − 1), …, j·a_r} y^{r+1}_{j'}, with a_r constant (the size of the aggregation window, so that n_r = n_{r+1}/a_r). (A sketch of the aggregation follows below.)
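     A minimal sketch of this OR-based compression ζ; the function name and the Int encoding are illustrative assumptions:

        # OR-aggregation of a binary vector over consecutive windows of size a:
        # the j-th compressed component is 1 iff some component of the j-th window is 1.
        compress(y::Vector{Int}, a::Int) =
            [maximum(y[(j - 1) * a + 1 : j * a]) for j in 1:length(y) ÷ a]

        compress([0, 1, 0, 0,  0, 0, 0, 0,  0, 0, 0, 1,  0, 0, 0, 0], 4)    # -> [1, 0, 1, 0]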
  26. c) Retrieval starts at the lowest-resolution layer (r = 1), with an input cue x̃ = (1, 0, 0, 0, 1, 0, 0, 1).
  27. c) Retrieval at layer r = 1: the cue x̃ produces the coarse output y^{r=1} = (0, 1, 0, 0).
  28. c) Retrieval, pruning rule. At each layer r: y^r_j = 0 ↔ y^{r+1}_{j'} = 0 for all j' ∈ {j·a_r − (a_r − 1), …, j·a_r}; hence the search along this window can be pruned.
  29. c) Retrieval, expansion rule. Conversely, y^r_j = 1 ↔ there exists j' ∈ {j·a_r − (a_r − 1), …, j·a_r} with y^{r+1}_{j'} = 1; hence the search along this window cannot be pruned, and it proceeds as for the Lernmatrix (dendritic sum and thresholded decision).
  30. c) Retrieval on the example: the coarse output y^{r=1} = (0, 1, 0, 0) prunes three of the four aggregation windows of the top layer, and only the remaining window is expanded with the dendritic sum and thresholded decision. (A two-layer sketch follows below.)
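     A two-layer sketch of this pruned retrieval (assumed names and conventions: following the indexing of the retrieval rule on slide 18, rows are output components; W1 is the aggregated layer, W2 the full Lernmatrix, a the aggregation factor):

        # Pruned retrieval: the coarse layer decides which windows of the top layer
        # are expanded; pruned windows are skipped entirely.
        function retrieve_pruned(W1::Matrix{Int}, W2::Matrix{Int}, x::Vector{Int},
                                 a::Int, theta::Int)
            y1 = Int.((W1 * x) .>= theta)               # coarse decision per window
            y2 = zeros(Int, size(W2, 1))
            for j in findall(==(1), y1)                 # only non-pruned windows
                for i in ((j - 1) * a + 1):(j * a)      # output components of window j
                    y2[i] = Int(sum(W2[i, :] .* x) >= theta)    # Lernmatrix rule
                end
            end
            return y2
        end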
  31. Section 4: Ordered Indexes Hierarchical Associative Memory.
  32. a) Motivation.
  33. a) Motivation: the matrix is sparse, and yet there may be no pruning, because the scattered 1s can leave no aggregation window empty. [Figure: a sparse example matrix whose 1s are spread over all windows.]
  34. b) Idea: reorder the columns so that the information (the 1s) is packed together and the zeros are clustered. [Figure: the same matrix before ("from here...") and after ("...to there") reordering.]
  35. b) Idea (continued): the reordered matrix has more zero regions and more null columns; in the example, p₁ drops from 11/32 to 9/32. [Figure: the matrix before and after reordering.]
  36. ... ???
  37. Back to the Lernmatrix learning example (slide 16): the weight matrix is a codification for the positions of the correlations.
  38. Outline of section 4: a) motivation, b) idea, c) solution: Ordered Indexes Hierarchical Associative Memory.
  39. c) Solution: Ordered Indexes Hierarchical Associative Memory. [Figure: a 5 × 12 example Lernmatrix, shown with its index headers on the next slide.]
  40. c) Solution: the same example matrix with its column indexes:
        Indexes:  1  2  3  4  5  6  7  8  9 10 11 12
        Line 1:   1  0  0  1  0  0  0  0  1  1  1  0
        Line 2:   1  0  0  1  1  0  0  0  1  1  0  0
        Line 3:   1  1  1  0  0  0  1  0  0  0  1  0
        Line 4:   0  0  1  0  0  0  0  1  0  0  0  0
        Line 5:   0  0  0  0  1  0  0  1  0  0  0  1
  41. c) Solution, auxiliary structures: AllSequenceList_initial, the sequence of all the column indexes (1, 2, …, 12), organized into subsequence nodes.
  42. Base of the algorithm:
        for each line L of W^{r=3}:
            for each subsequence SS in L:
                if OnesAndZerosUnclustered?(SS):
                    SplitAndOrder(SS)
  43. The same base algorithm, with a note: variants of the model are more intelligent ways to select the next line to be tested. (A runnable reconstruction follows below.)
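     A runnable reconstruction of the base algorithm for the naive line order, under two stated assumptions: AllSequenceList is kept as an ordered list of subsequences of column indexes, and a split places the 1-columns of the current line before its 0-columns.

        # Reorder the column indexes of W so that, line by line, the 1s and 0s of
        # every current subsequence become clustered (SplitAndOrder per subsequence).
        function order_columns(W::Matrix{Int})
            seqs = [collect(1:size(W, 2))]              # AllSequenceList, initially one node
            for i in 1:size(W, 1)                       # naive variant: lines in order
                refined = Vector{Vector{Int}}()
                for ss in seqs
                    ones_part  = [j for j in ss if W[i, j] == 1]
                    zeros_part = [j for j in ss if W[i, j] == 0]
                    if isempty(ones_part) || isempty(zeros_part)
                        push!(refined, ss)                        # already clustered
                    else
                        push!(refined, ones_part, zeros_part)     # split and order, 1s first
                    end
                end
                seqs = refined
            end
            return vcat(seqs...)                        # final sequence of column indexes
        end

     On the 5 × 12 example matrix (slides 40 and 44) this reproduces the final sequence 1, 4, 9, 10, 11, 5, 3, 2, 7, 8, 12, 6 shown on slide 59.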
  44. Illustrative example: the Lernmatrix after the learning phase (the 5 × 12 matrix of slide 40).
  45. Iteration 1, line 1: (1 0 0 1 0 0 0 0 1 1 1 0).
  46. Iteration 1, line 1: zeros and ones unclustered? Yes.
  47. Iteration 1, line 1: zeros and ones unclustered? Yes, so split and order.
  48. Iteration 1, line 1, ordering: the column indexes are reordered to 1, 4, 9, 10, 11 (the 1s) followed by 2, 3, 5, 6, 7, 8, 12 (the 0s).
  49. (The same ordering, shown again.)
  50. AllSequenceList after iteration 1: two subsequence nodes, (1, 4, 9, 10, 11) and (2, 3, 5, 6, 7, 8, 12), holding the ordered sequence.
  51. (AllSequenceList after iteration 1, shown again.)
  52. Iteration 2, line 2: (1 0 0 1 1 0 0 0 1 1 0 0).
  53. Iteration 2, line 2, read against AllSequenceList after iteration 1: nodes (1, 4, 9, 10, 11) and (2, 3, 5, 6, 7, 8, 12).
  54. Line 2's values on the first node (1, 4, 9, 10, 11) are (1, 1, 1, 1, 0) and on the second node (2, 3, 5, 6, 7, 8, 12) are (0, 0, 1, 0, 0, 0, 0).
  55. AllSequenceList after iteration 2: each of the two nodes is split around the 1s of line 2, giving (1, 4, 9, 10), (11), (5) and (2, 3, 6, 7, 8, 12).
  56. Iteration 3, line 3: the subsequences are refined again around the 1s of line 3 (indexes 1, 2, 3, 7 and 11). [Figure: AllSequenceList after iteration 3.]
  57. Iteration 4, line 4: refinement around the 1s of line 4 (indexes 3 and 8). [Figure: AllSequenceList after iteration 4.]
  58. Iteration 5, line 5: refinement around the 1s of line 5 (indexes 5, 8 and 12). [Figure: AllSequenceList after iteration 5.]
  59. Final sequence of column indexes: 1, 4, 9, 10, 11, 5, 3, 2, 7, 8, 12, 6. The reordered matrix:
        Indexes:  1  4  9 10 11  5  3  2  7  8 12  6
        Line 1:   1  1  1  1  1  0  0  0  0  0  0  0
        Line 2:   1  1  1  1  0  1  0  0  0  0  0  0
        Line 3:   1  0  0  0  1  0  1  1  1  0  0  0
        Line 4:   0  0  0  0  0  0  1  0  0  1  0  0
        Line 5:   0  0  0  0  0  1  0  0  0  1  1  0
  60. Comparison of the un-ordered and the ordered matrix (slides 40 and 59 above).
  61. OR-based learning of the aggregated layers on top of the ordered matrix. [Figure: the ordered matrix with its aggregated layers.]
  62. Retrieval uses filtering (pruning) as before. [Figure: the ordered matrix with its aggregated layers.]
  63. Retrieval additionally only needs the new sequence of indexes (AllSequenceList) to restore the correct order of the components in the output.
  64. Retrieval mappings: the column considered (position 1, 2, …, 12 in the reordered matrix) is mapped to the column meant (the original index, because of the reordering): <column meant> = AllSequenceList[<column considered>], with AllSequenceList = (1, 4, 9, 10, 11, 5, 3, 2, 7, 8, 12, 6). (A small sketch follows below.)
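     A small sketch of this bookkeeping step (assumed names): the output computed over the reordered columns is written back to its original positions using AllSequenceList.

        # Component k of the reordered output refers to original component
        # AllSequenceList[k]; write it back there to restore the original order.
        function restore_order(y_reordered::Vector{Int}, all_sequence_list::Vector{Int})
            y = zeros(Int, length(y_reordered))
            for (k, j) in enumerate(all_sequence_list)
                y[j] = y_reordered[k]       # column considered k, column meant j
            end
            return y
        end

        restore_order([0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0],
                      [1, 4, 9, 10, 11, 5, 3, 2, 7, 8, 12, 6])   # 1s at original positions 11 and 5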
  65. Example mappings: column considered 5 means original column 11, 6 means 5, 11 means 12, and 12 means 6.
  66. [Figure: the ordered hierarchical memory across layers r = 3, r = 2 and r = 1, with the entries w_ij = 1 and w_ij = 0 marked.]
  67. [Figure: the same three-layer view, continued.]
  68. Outline of section 4 (continued): a) motivation, b) idea, c) solution: Ordered Indexes Hierarchical Associative Memory, d) empirical experiments.
  69. d) Empirical experiments. Method: test the quality (add errors, as before) and the performance (number of computations) of retrieval, counting additions, multiplications, threshold comparisons and fire-shots.
  70. d) Empirical experiments, default setup, implemented in the Julia programming language: a database of 1600 patterns and 120 tests. Each test measures performance and quality as a function of memory load, i.e., the probability that w_ij = 1; the memory load is varied by varying p (the number of patterns learned) while 20 patterns are retrieved. N (the number of neurons) and K (the activity level, i.e. the number of 1s per vector, Gaussian-distributed) are fixed. Each test is run by 5 models.
  71. The 5 models: the Lernmatrix; the Hierarchical Associative Memory; and 3 Ordered Indexes Hierarchical Associative Memory variants (lines for iteration chosen naively; lines with more 1s chosen first; lines with more 0s chosen first, plus right null columns discarded).
  72. [Plot: total number of steps as a function of memory load in layer r = 3, for each test and for each model: Lernmatrix; Hierarchical Ass. Mem.; (naively) Ordered H.A.M.; (1s first) Ordered H.A.M.; Ordered H.A.M. with right null-columns discarded.]
  73. d) Empirical experiments: observations and discussion.
  74. Hierarchical models: their curves lie far below the Lernmatrix curve (roughly 80% fewer steps). [Plot with the same five-model legend.]
  75. Why? Because roughly 80% of the columns are pruned. [Plot with the same five-model legend.]
  76. The ordered hierarchical models outperform the original hierarchical model. Why? 1. Reordering improves the aggregations used for pruning. [Plot legend: original vs. ordered models.]
  77. [Plot: total steps (y) versus memory load in r = 1 (x), showing a shift between the curves of the models.] Why? 1. Reordering optimizes the aggregations for pruning; 2. reordering frees space.
  78. The model that discards right null-columns outperforms the other ordered column-indexes models. Why? The right part of the matrix is not even visited. [Plot with the same legend.]
  79. Section 5: Conclusion.
  80. 5.1) Achievements. Considerable savings: the ordered column-indexes hierarchical model with right null-columns discarded needs only 2-20% (worst case) of the Lernmatrix's total number of steps. Worthy trade-offs: more resources are spent on infrastructure and computations in the learning phase, but that phase is done only once and its cost is overcome by the benefits in retrievals.
  81. 5.2) Future work: a variable aggregation factor (adapt the window to the density of different zones of the matrices); a different distribution for the correlations (check how non-uniform activity patterns affect the models); and the cost of hierarchical models of neural networks in biology.
  82. Thank you! Questions?
