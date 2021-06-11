
  1. 1. Reservoir Computing Fast Deep Learning for Sequences Claudio Gallicchio, University of Pisa (Italy)
  2. 2. About me ● Researcher at the Department of Computer Science, University of Pisa ● Machine Learning, Deep Learning, Neural Networks, Dynamical Systems ○ Reservoir Computing ○ Deep Randomized Neural Networks ○ Learning in Structured Domains ● IEEE Task Forces ○ Chair of the IEEE Task Force on Reservoir Computing ○ Vice-Chair of the IEEE Task Force on Randomization-Based Neural Networks and Learning Systems ● Workshops, Tutorials ○ DL in Unconventional Neuromorphic Hardware (IJCNN-21) ○ ML for irregular time-series (ECML PKDD-21) ○ Deep Randomized Neural Networks (AAAI-21) gallicch@di.unipi.it
  3. 3. Reservoir Computing ● Convenient way of designing Neural Networks for sequential data ● Stability ● Efficiency
  4. 4. Deep Learning for Time-Series
  5. 5. Machine Learning Algorithms that learn from the data Classical Programming data rules answers Machine Learning data answers rules
  6. 6. Deep Learning ● Learn representations from the data ● Progressive abstraction
  7. 7. Sequential Data
  8. 8. Recurrent Neural Networks ● State update: $h_t = f_\theta(x_t, h_{t-1})$ ● Output function: $y_t = g_{\theta_o}(h_t)$ Recurrent layer/cell with input $x_t$, state $h_t$ and output $y_t$; here $h_{t-1}$ is the old state and $\theta$ the set of parameters.
  9. 9. Recurrent Neural Networks ● State update: $h_t = \tanh(x_t W_{xh} + h_{t-1} W_{hh})$ ● Output function: $y_t = h_t W_{hy}$ Recurrent layer/cell with input $x_t$, state $h_t$ and output $y_t$; $W_{xh}$ is the input weight matrix, $W_{hh}$ the recurrent weight matrix, $W_{hy}$ the output weight matrix.
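The state update and output function above can be sketched in a few lines of NumPy. This is a minimal illustration, not the speaker's code: the row-vector convention (`x @ W`), the layer sizes, the zero initial state and the random weights are all assumptions made for the example.

```python
import numpy as np

def rnn_forward(X, W_xh, W_hh, W_hy):
    """Apply h_t = tanh(x_t W_xh + h_{t-1} W_hh) and y_t = h_t W_hy over a sequence."""
    T, n_h = X.shape[0], W_hh.shape[0]
    h = np.zeros(n_h)                 # assumed initial state h_0 = 0
    H = np.zeros((T, n_h))
    for t in range(T):
        h = np.tanh(X[t] @ W_xh + h @ W_hh)   # state update
        H[t] = h
    Y = H @ W_hy                      # linear output function at every step
    return H, Y

rng = np.random.default_rng(0)
n_x, n_h, n_y, T = 3, 5, 2, 10        # illustrative sizes (assumptions)
W_xh = rng.normal(scale=0.5, size=(n_x, n_h))
W_hh = rng.normal(scale=0.5, size=(n_h, n_h))
W_hy = rng.normal(scale=0.5, size=(n_h, n_y))
X = rng.normal(size=(T, n_x))

H, Y = rnn_forward(X, W_xh, W_hh, W_hy)
print(H.shape, Y.shape)
```

The same three matrices are reused at every time step, which is what the unrolled computational graph on the next slide depicts.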
  10. 10. Computational Graph [Figure: the recurrent layer/cell unrolled over time steps 1, 2, …, t; the weight matrices $W_{xh}$, $W_{hh}$, $W_{hy}$ are shared across all steps, and the outputs feed the loss L]
  11. 11. Forward Computation Fading/Exploding memory: ● the influence of inputs far in the past vanishes/explodes in the current state ● many (non-linear) transformations [Figure: the network unrolled over time steps 1, 2, 3, …, t with shared weights $W_{xh}$, $W_{hh}$, $W_{hy}$]
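A small numerical experiment can make the fading-memory case concrete. The sketch below runs the same tanh state update on two input sequences that differ only at the first step and tracks the distance between the two states. The sizes, the perturbation and the choice of a recurrent matrix with largest singular value 0.9 are assumptions picked for illustration.

```python
import numpy as np

def state_gap(W_xh, W_hh, X_a, X_b):
    """Distance between the states driven by two input sequences, step by step."""
    h_a = np.zeros(W_hh.shape[0])
    h_b = np.zeros(W_hh.shape[0])
    gaps = []
    for x_a, x_b in zip(X_a, X_b):
        h_a = np.tanh(x_a @ W_xh + h_a @ W_hh)
        h_b = np.tanh(x_b @ W_xh + h_b @ W_hh)
        gaps.append(np.linalg.norm(h_a - h_b))
    return np.array(gaps)

rng = np.random.default_rng(0)
n_x, n_h, T = 2, 20, 30               # illustrative sizes (assumptions)
W_xh = rng.uniform(-1, 1, size=(n_x, n_h))
W = rng.uniform(-1, 1, size=(n_h, n_h))
W_hh = 0.9 * W / np.linalg.norm(W, 2)  # largest singular value set to 0.9

X_a = rng.normal(size=(T, n_x))
X_b = X_a.copy()
X_b[0] += 1.0                          # perturb only the first input

gaps = state_gap(W_xh, W_hh, X_a, X_b)
print(gaps[0], gaps[-1])
```

Because tanh is 1-Lipschitz and the recurrent matrix here shrinks distances by at most a factor 0.9 per step, the gap caused by the first input decays geometrically: the old input fades from the current state. With a recurrent matrix of larger norm that guarantee is lost.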
  12. 12. Backpropagation Through Time (BPTT) Gradient Propagation ● gradient might vanish/explode through many non-linear transformations ● difficult to train on long-term dependencies [Figure: the unrolled network over time steps 1, 2, 3, …, t with shared weights $W_{xh}$, $W_{hh}$, $W_{hy}$ and the gradients propagated backwards through the states] Bengio et al, “Learning long-term dependencies with gradient descent is difficult”, IEEE Transactions on Neural Networks, 1994 Pascanu et al, “On the difficulty of training recurrent neural networks”, ICML 2013
  13. 13. Approaches ● Gated architectures ○ LSTM, GRU ○ training is slow
  14. 14. Randomization in Deep Neural Networks
  15. 15. Deep Learning Deep Learning models achieved tremendous success over the years. This comes at very high cost in terms of ● Time ● Parameters Do we really need this all the time?
  16. 16. Example: embedded applications Source: https://bitalino.com/en/freestyle-kit-bt Source: https://www.eenewsembedded.com/news/ raspberry-pi-3-now-compute-module-format
  17. 17. Complexity / Accuracy Tradeoff Accuracy Complexity Deep NNs Linear models SVMs-like Deep Randomized NNs
  18. 18. The Philosophy “Randomization is computationally cheaper than optimization” Rahimi, A. and Recht, B., 2008. Weighted sums of random kitchen sinks: Replacing minimization with randomization in learning. Advances in neural information processing systems, 21, pp.1313-1320. Rahimi, A. and Recht, B., 2007. Random features for large-scale kernel machines. Advances in neural information processing systems, 20, pp. 1177-1184.
  19. 19. Randomization = Efficiency ● Training algorithms are cheaper and simpler ● Model transfer: don’t need to transmit all the weights ● Amenable to neuromorphic implementations
  20. 20. Historical note: the cortico-striatal model ● Fixed recurrent connections in the PFC ● Dopamine regulated connections between PFC and neurons in the striatum (CD) Dominey, P.F., 2013. Recurrent temporal networks and language acquisition—from corticostriatal neurophysiology to reservoir computing. Frontiers in psychology, 4, p.500.
  21. 21. Reservoir Computing
  22. 22. Reservoir Computing: focus on the dynamical system $h_t = \tanh(x_t W_{xh} + h_{t-1} W_{hh})$ ● Architecture: input layer and reservoir (fixed), readout (trainable) ● Randomly initialized under stability conditions on the dynamical system ● Stable dynamics: Echo State Property Verstraeten, David, et al. Neural networks 20.3 (2007). Lukoševičius, Mantas, and Herbert Jaeger. Computer Science Review 3.3 (2009).
  23. 23. Echo State Network Jaeger, Herbert, and Harald Haas. Science 304.5667 (2004): 78-80.
  24. 24. Liquid State Machine Maass, Wolfgang, Thomas Natschläger, and Henry Markram. Neural computation 14.11 (2002): 2531-2560.
  25. 25. Fractal Prediction Machine Tino, Peter, and Georg Dorffner. Machine Learning 45.2 (2001): 187-217.
  26. 26. Echo State Networks (ESNs) Reservoir $h_t = \tanh(x_t W_{xh} + h_{t-1} W_{hh})$ ● large layer of recurrent units ● sparsely connected ● randomly initialized (ESP) ● untrained [Figure: input layer, reservoir, readout]
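The reservoir described above (large, sparse, random, untrained) can be sketched as follows. The reservoir size, connection density, input scaling, target spectral radius and the sine-wave input are hypothetical values chosen for the example, not settings taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(42)
n_x, n_h = 1, 100       # input size and number of reservoir units (assumptions)
density = 0.1           # fraction of non-zero recurrent weights (assumption)
rho = 0.9               # target spectral radius, below 1 (assumption)
omega_in = 0.5          # input scaling (assumption)

# Random input weights, never trained
W_xh = rng.uniform(-omega_in, omega_in, size=(n_x, n_h))

# Random sparse recurrent weights, rescaled to the target spectral radius
W_hh = rng.uniform(-1, 1, size=(n_h, n_h))
W_hh = W_hh * (rng.random((n_h, n_h)) < density)
W_hh = W_hh * (rho / np.max(np.abs(np.linalg.eigvals(W_hh))))

def run_reservoir(X, W_xh, W_hh):
    """Collect h_t = tanh(x_t W_xh + h_{t-1} W_hh) for every time step."""
    h = np.zeros(W_hh.shape[0])
    H = np.empty((len(X), W_hh.shape[0]))
    for t, x in enumerate(X):
        h = np.tanh(x @ W_xh + h @ W_hh)
        H[t] = h
    return H

X = np.sin(np.linspace(0, 8 * np.pi, 200)).reshape(-1, 1)  # toy input signal
H = run_reservoir(X, W_xh, W_hh)
print(H.shape)
```

No gradient is computed anywhere: the reservoir only transforms the input sequence into a sequence of states that a readout can later use.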
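  27. 27. Echo State Networks (ESNs) Readout $y_t = h_t W_{hy}$ ● linear combination of the reservoir state variables ● can be trained in closed form: $W_{hy} = (H^\top H)^{-1} H^\top D$, where $H$ stacks the reservoir states row by row and $D$ the corresponding targets [Figure: input layer, reservoir, readout]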
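The closed-form readout training can be sketched as a least-squares solve. In this illustration the state matrix `H` is a random stand-in for collected reservoir states, the targets `D` are generated from a known matrix so the result can be checked, and the optional `ridge` term is an added assumption (a common regularizer) rather than something stated on the slide.

```python
import numpy as np

def train_readout(H, D, ridge=0.0):
    """Solve (H^T H + ridge * I) W_hy = H^T D for the readout weights."""
    n_h = H.shape[1]
    A = H.T @ H + ridge * np.eye(n_h)
    return np.linalg.solve(A, H.T @ D)   # avoids forming the explicit inverse

rng = np.random.default_rng(0)
H = rng.normal(size=(500, 20))           # stand-in for 500 reservoir states
W_true = rng.normal(size=(20, 3))        # hypothetical ground-truth readout
D = H @ W_true                           # noiseless targets for the check

W_hy = train_readout(H, D)
print(np.max(np.abs(W_hy - W_true)))
```

With `ridge=0.0` this is exactly the formula $W_{hy} = (H^\top H)^{-1} H^\top D$; a small positive `ridge` is often used in practice when $H^\top H$ is poorly conditioned.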
  28. 28. Reservoir Initialization ● Random but stable ○ the state should not be sensitive to tiny input perturbations ● Control the max singular value of $W_{hh}$, or ● Control the spectral radius of $W_{hh}$: $\rho(W_{hh}) < 1$ 1. Generate a random matrix and then re-scale it to the desired $\rho(W_{hh})$ 2. Generate a random matrix from a zero-centred uniform distribution whose range is set by a hyper-parameter
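The first initialization recipe (generate, then re-scale) can be sketched for both stability controls mentioned above. The helper names, the matrix size and the target value 0.9 are assumptions for the example.

```python
import numpy as np

def spectral_radius(W):
    """Largest absolute eigenvalue of W."""
    return np.max(np.abs(np.linalg.eigvals(W)))

def rescale_spectral_radius(W, rho):
    """Rescale W so that its spectral radius equals rho."""
    return W * (rho / spectral_radius(W))

def rescale_max_singular_value(W, sigma):
    """Rescale W so that its largest singular value equals sigma."""
    return W * (sigma / np.linalg.norm(W, 2))

rng = np.random.default_rng(1)
W = rng.uniform(-1, 1, size=(50, 50))    # random recurrent matrix (assumed size)

W_rho = rescale_spectral_radius(W, 0.9)
W_sigma = rescale_max_singular_value(W, 0.9)

print(spectral_radius(W_rho), np.linalg.norm(W_rho, 2))
print(spectral_radius(W_sigma), np.linalg.norm(W_sigma, 2))
```

Since the spectral radius never exceeds the largest singular value, controlling the singular value is the stricter of the two conditions: `W_sigma` also has spectral radius below 0.9, while `W_rho` may have a largest singular value above 1.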
  29. 29. Why does it work? Suffix-based Markovian organization of the state space of contraction reservoir mappings even prior to learning. [Figure: input sequences of +1/-1 symbols and their influence on the state] ESNs exploit the architectural bias of RNNs Gallicchio, Claudio, and Alessio Micheli. "Architectural and markovian factors of echo state networks." Neural Networks 24.5 (2011): 440-456.
  30. 30. Chaotic attractors
  31. 31. Distributed Intelligence Applications Dragone, Mauro, et al. "A cognitive robotic ecology approach to self-configuring and evolving AAL systems." Engineering Applications of Artificial Intelligence 45 (2015): 269-280.
  32. 32. Robot localization in critical environments Dragone, Mauro, et al. ESANN. 2016.
  33. 33. Human Activity Recognition ● Classification of human daily activities from RSS data generated by sensors worn by the user http://archive.ics.uci.edu/ml/datasets/Activity+Recognition+system+based+on+Multisensor+data+fusion+%28AReM%29 Dataset is available online on the UCI repository
  34. 34. Clinical applications ● Automatic assessment of balance skills ● Predict the outcome of the Berg Balance Scale (BBS) clinical test from time-series of pressure sensors Bacciu, Davide, et al. Engineering Applications of Artificial Intelligence 66 (2017): 60-74. oremi Wii Balance Board BBS
  35. 35. https://www.linkedin.com/company/teaching-horizon-2020/ https://twitter.com/TEACHING_H2020 https://www.teaching-h2020.eu
  36. 36. RC in Autonomous Vehicles ● Automatic detection of physiological, emotional, cognitive state of the human → Human-centric personalization ● Good performance in human state monitoring + efficiency D. Bacciu, D. Di Sarli, C. Gallicchio, A. Micheli, N. Puccinelli, “Benchmarking Reservoir and Recurrent Neural Networks for Human State and Activity Recognition”, IWANN 2021
  37. 37. Distributed, embeddable and federated learning
  38. 38. …and Beyond
  39. 39. Physical Reservoir Computing Tanaka, G., Yamane, T., Héroux, J.B., Nakane, R., Kanazawa, N., Takeda, S., Numata, H., Nakano, D. and Hirose, A., 2019. Recent advances in physical reservoir computing: A review. Neural Networks, 115, pp.100-123.
  40. 40. Workshop W1: Deep Learning in Unconventional Neuromorphic Hardware Friday, July 23, 12:30PM-4:30PM, Room: IJCNN Virtual Room 1 https://events.femto-st.fr/DLUNH/en/program
  41. 41. Depth in RNNs shallow deep input deep readout deep reservoir Pascanu, R., Gulcehre, C., Cho, K. and Bengio, Y., 2013. How to construct deep recurrent neural networks. arXiv preprint arXiv:1312.6026.
  42. 42. Deep Echo State Networks [Figure: input layer, reservoir layer 1, reservoir layer 2, …, reservoir layer L (fixed), readout (trainable)] Gallicchio, Claudio, Alessio Micheli, and Luca Pedrelli. "Deep reservoir computing: A critical experimental analysis." Neurocomputing 268 (2017): 87-99
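A stack of reservoirs can be sketched by feeding the state of each layer as input to the next one at the same time step. This is a minimal illustration under stated assumptions: the layer sizes, scalings and spectral radius are hypothetical, and concatenating the states of all layers for the readout is one possible design choice, not necessarily the one used in the cited work or in the linked DeepRC-TF repository.

```python
import numpy as np

def init_layer(rng, n_in, n_h, rho=0.9, omega=0.5):
    """Random, untrained layer: input weights plus rescaled recurrent weights."""
    W_in = rng.uniform(-omega, omega, size=(n_in, n_h))
    W_hh = rng.uniform(-1, 1, size=(n_h, n_h))
    W_hh = W_hh * (rho / np.max(np.abs(np.linalg.eigvals(W_hh))))
    return W_in, W_hh

def run_deep_reservoir(X, layers):
    """Drive the stacked reservoirs and return the concatenated layer states."""
    states = [np.zeros(W_hh.shape[0]) for _, W_hh in layers]
    total = sum(W_hh.shape[0] for _, W_hh in layers)
    out = np.empty((len(X), total))
    for t, x in enumerate(X):
        u = x                                   # layer 1 reads the external input
        for l, (W_in, W_hh) in enumerate(layers):
            states[l] = np.tanh(u @ W_in + states[l] @ W_hh)
            u = states[l]                       # next layer reads this layer's state
        out[t] = np.concatenate(states)
    return out

rng = np.random.default_rng(0)
n_x, n_h, n_layers = 1, 30, 3                   # illustrative sizes (assumptions)
layers = [init_layer(rng, n_x if l == 0 else n_h, n_h) for l in range(n_layers)]

X = rng.normal(size=(100, n_x))
H = run_deep_reservoir(X, layers)
print(H.shape)
```

Only a linear readout on top of `H` would be trained; every reservoir layer stays fixed.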
  43. 43. Multiple time-scales ● Effects of input perturbations last longer in the higher reservoir layers ● Multiple time-scales representation is intrinsic Gallicchio, Claudio, Alessio Micheli, and Luca Pedrelli. "Deep reservoir computing: A critical experimental analysis." Neurocomputing 268 (2017): 87-99 Gallicchio, C. and Micheli, A., 2018, July. Why Layering in Recurrent Neural Networks? A DeepESN Survey. In 2018 International Joint Conference on Neural Networks (IJCNN)(pp. 1-8). IEEE.
  44. 44. Implementation https://github.com/gallicch/DeepRC-TF
  45. 45. Structured data time-series graphs
  46. 46. Neural networks for graphs , { }
  47. 47. Vertex-wise graph encoding ● time-step → vertex ● previous time step → neighborhood [Figure: a graph with vertices v1, v2, v3, v4; the embedding (state) of a vertex v is computed from its input features and the embeddings of its neighbors]
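The analogy above (vertex in place of time step, neighborhood in place of previous step) can be sketched as an iterated map over all vertices at once. This is an illustrative sketch, not the method of the cited AAAI paper: the sum aggregation over neighbors, the fixed-point iteration, the 4-vertex path graph and the scaling that makes the map contractive are all assumptions.

```python
import numpy as np

def graph_reservoir(X, A, W_in, W_hh, tol=1e-8, max_iter=500):
    """Iterate H <- tanh(X W_in + A H W_hh) until the embeddings stop changing."""
    H = np.zeros((X.shape[0], W_hh.shape[0]))
    for _ in range(max_iter):
        H_new = np.tanh(X @ W_in + A @ H @ W_hh)   # A @ H sums neighbor embeddings
        if np.linalg.norm(H_new - H) < tol:
            return H_new
        H = H_new
    return H

# Path graph v1 - v2 - v3 - v4 (symmetric adjacency matrix)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

rng = np.random.default_rng(0)
n_x, n_h = 2, 10                                   # illustrative sizes (assumptions)
X = rng.normal(size=(4, n_x))                      # input features, one row per vertex
W_in = rng.uniform(-1, 1, size=(n_x, n_h))
W_hh = rng.uniform(-1, 1, size=(n_h, n_h))
# Scale so that ||A||_2 * ||W_hh||_2 = 0.9 < 1, which makes the map a contraction
W_hh = W_hh * (0.9 / (np.linalg.norm(A, 2) * np.linalg.norm(W_hh, 2)))

H = graph_reservoir(X, A, W_in, W_hh)
print(H.shape)
```

Each row of `H` is the embedding of one vertex; as in the sequential case, the weights are random and fixed, and stability is what guarantees that the iteration settles.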
  48. 48. It’s accurate Gallicchio, C. and Micheli, A., 2020. Fast and Deep Graph Neural Networks. In AAAI (pp. 3898-3905).
  49. 49. It’s accurate Gallicchio, C. and Micheli, A., 2020. Fast and Deep Graph Neural Networks. In AAAI (pp. 3898-3905).
  50. 50. It’s fast Gallicchio, C. and Micheli, A., 2020. Fast and Deep Graph Neural Networks. In AAAI (pp. 3898-3905).
  51. 51. Conclusions
  52. 52. Summary ● Reservoir Computing: paradigm for designing and training RNNs ○ fixed hidden recurrent layer (controlled for asymptotic stability) ○ trainable readout layer ● Fast (& simple) training compared to standard RNNs ● Good for sensor data ● Very active area of research… ○ Embedded applications ○ Unconventional Neuromorphic Hardware ○ European Projects
  53. 53. Deep Randomized Neural Networks Gallicchio, C. and Scardapane, S., 2020. Deep Randomized Neural Networks. In Recent Trends in Learning From Data (pp. 43-68). Springer, Cham. https://arxiv.org/pdf/2002.12287.pdf AAAI-21 tutorial website: https://sites.google.com/site/cgallicch/resources/tutorial_DRNN
  54. 54. IEEE Task Force on Randomization-based Neural Networks and Learning Systems Promote the research and applications of deep randomized neural networks and learning systems, demonstrate the competitive performance of randomization-based algorithms in diverse scenarios, and educate the research community about randomization-based learning methods and their relationships. https://sites.google.com/view/randnn-tf/
  55. 55. IEEE Task Force on Reservoir Computing Promote and stimulate the development of Reservoir Computing research under both theoretical and application perspectives. https://sites.google.com/view/reservoir-computing-tf/
  56. 56. Reservoir Computing Fast Deep Learning for Sequences Claudio Gallicchio gallicch@di.unipi.it https://www.linkedin.com/in/claudio-gallicchio-05a47038/ https://twitter.com/claudiogallicc1 Thanks for attending!

