Machine learning on 1 cm² - Tweakers Dev Summit

  1. Machine learning on one square centimeter. Jan Jongboom, Arm. Tweakers Dev Summit, 14 February 2019
  2. Jan Jongboom, Principal Developer Evangelist, jan.jongboom@arm.com
  3. 8 years ago...
  4. http://www-03.ibm.com/ibm/history/ibm100/us/en/icons/ibm700series/impacts/
  5. https://cdn2.i-scmp.com/sites/default/files/styles/980x551/public/images/methode/2017/05/23/6660b96e-3f9d-11e7-8c27-b06d81bc1bba_1280x720_183924.JPG?itok=ZmONr2a_
  6. Machine learning
  7. Downsides: I'm not rich enough to develop 5,000 custom processors; centralized; costs lots of power and bandwidth; privacy. Not me
  8. Microcontrollers: small (1 cm²), cheap (~$1), efficient (years on battery). Downsides: slow (max. 100 MHz), limited memory (max. 256K RAM). 8 cm
  9. Reinforcement learning. Board marks: X X O O. Content of matchbox. Every box represents a state of the board.
  10. Reinforcement learning. Board marks: X X X O O. Content of matchbox. Rules (reward function): lose: remove bead; draw: place 1 bead back; win: place 3 beads back.
  11. Reinforcement learning. Board marks: X X X O O O. Content of matchbox.
  12. Reinforcement learning. Board marks: X X X O O. Content of matchbox.
  13. Reinforcement learning. Board marks: X X O O. Content of matchbox.
  14. Training vs. classification. Training: hundreds of different states; need to encounter states many times; training takes long! Classification, however, is simple: play the game, and you have to open at most 4 drawers!
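
The matchbox scheme from the reinforcement learning slides can be sketched in a few lines of Python. This is a minimal sketch, not the speaker's implementation: the class name `MatchboxPlayer`, the starting count of 3 beads per move, the floor of 1 bead per move, and the reading of the rules as bead deltas (lose: -1, draw: +1, win: +3) applied to every move played in a game are all assumptions.

```python
import random

# Bead deltas read off the "Rules" slide (interpretation is an assumption).
REWARD = {"lose": -1, "draw": 1, "win": 3}
INITIAL_BEADS = 3  # assumption: starting beads per legal move


class MatchboxPlayer:
    """One 'matchbox' per board state; beads in a box weight the moves."""

    def __init__(self, seed=None):
        self.rng = random.Random(seed)
        self.boxes = {}    # state -> {move: bead count}
        self.history = []  # (state, move) pairs played in the current game

    def choose(self, state, legal_moves):
        # Open the box for this state, creating it on first encounter.
        box = self.boxes.setdefault(
            state, {move: INITIAL_BEADS for move in legal_moves}
        )
        moves = list(box)
        weights = [box[move] for move in moves]
        # Draw one bead at random: more beads means a more likely move.
        move = self.rng.choices(moves, weights=weights, k=1)[0]
        self.history.append((state, move))
        return move

    def learn(self, outcome):
        # Apply the reward to every box that was opened during the game.
        delta = REWARD[outcome]
        for state, move in self.history:
            box = self.boxes[state]
            # Assumption: keep at least 1 bead so a box never runs empty.
            box[move] = max(1, box[move] + delta)
        self.history.clear()
```

Training calls `choose` for every move and `learn` once per finished game, over many games. Playing with a trained set of boxes only needs the `choose` calls, which matches the slide's point that classification means opening a handful of drawers.
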
  15. Machine learning on the edge: typically only classification; typically more efficient than sending data over the network; size of the network still matters.
  16. Enabling new use cases: sensor fusion. http://www.gierad.com/projects/supersensor/
  17. Enabling new use cases: federated learning. https://research.googleblog.com/2017/04/federated-learning-collaborative.html
  18. Enabling new use cases: LPWANs. http://ufldl.stanford.edu/tutorial/unsupervised/Autoencoders/
  19. Enabling new use cases: offline self-contained systems. https://os.mbed.com/blog/entry/streaming-data-cows-dsa2017/
  20. Practical example: https://www.youtube.com/watch?v=FhbCAd0sO1c
  21. Training... MNIST data set. Training set: 60,000 images. Every drawing is downsampled to 28x28 pixels. Supervised learning through backpropagation. https://blog.hackster.io/simple-neural-network-on-mcus-a7cbd3dc108c
  22. Neural networks are made up of neurons. Four inputs feed an activation function (fn), which produces one output.
  23. Neural networks are made up of neurons. Inputs: 3, 2, 5, 7. Connections. Activation: sum > 10. Output: 1.
  24. Neural networks are made up of neurons. Inputs: 3, 2, 5, 7. Connection weights: 0.3, 1.0, 0.1, 0.6. Activation: sum > 10. Output: 0.
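
The two neuron slides can be checked with a short sketch. It assumes that the unweighted slide uses a weight of 1.0 on every connection and that the weights on the second slide apply to the inputs in the order listed; the function names are illustrative, not from the talk.

```python
def weighted_sum(inputs, weights):
    # Multiply each input by the weight of its connection and add them up.
    return sum(x * w for x, w in zip(inputs, weights))


def neuron(inputs, weights, threshold=10):
    # Step activation from the slide: fire (1) only when the sum exceeds 10.
    return 1 if weighted_sum(inputs, weights) > threshold else 0


inputs = [3, 2, 5, 7]

# Unweighted case: 3 + 2 + 5 + 7 = 17, which is above 10, so the output is 1.
print(neuron(inputs, [1.0, 1.0, 1.0, 1.0]))

# Weighted case: 0.9 + 2.0 + 0.5 + 4.2 is about 7.6, below 10, so the output is 0.
print(neuron(inputs, [0.3, 1.0, 0.1, 0.6]))
```

The same inputs produce a different output once the connections carry weights, which is the quantity training adjusts.
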
  25. Neural networks are made up of layers of neurons. Input: 28x28 = 784 pixels. Output: digits 0 to 9. Before training the weights are random; we adjust them during training.
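
A layered network with random, untrained weights might look like the sketch below. The hidden layer sizes (128 and 64) are taken from the "Things at play" slide; the ReLU activation on every layer, the absence of bias terms, the uniform initialisation range, and all function names are assumptions made for illustration.

```python
import random

# 784 input pixels, two hidden layers, 10 output classes (digits 0 to 9).
LAYER_SIZES = [784, 128, 64, 10]


def init_weights(sizes, rng):
    # One weight matrix per pair of adjacent layers, filled with random values,
    # as on the slide: before training the weights are random.
    return [
        [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        for n_in, n_out in zip(sizes, sizes[1:])
    ]


def forward(pixels, layers):
    activations = pixels
    for layer in layers:
        # Each neuron: weighted sum of the previous layer, then ReLU (assumed).
        activations = [
            max(0.0, sum(w * a for w, a in zip(row, activations)))
            for row in layer
        ]
    return activations


def classify(pixels, layers):
    # The predicted digit is the index of the highest-scoring output neuron.
    scores = forward(pixels, layers)
    return scores.index(max(scores))


rng = random.Random(42)
layers = init_weights(LAYER_SIZES, rng)
image = [rng.random() for _ in range(784)]  # stand-in for a 28x28 drawing
print(classify(image, layers))  # arbitrary until the weights are trained
```
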
  26. After training: https://vas3k.com/blog/machine_learning/
  27. Things at play. Number of connections: (784 x 128) + (128 x 64) + (64 x 10) + (10 x 1) = 109,194. Connections have weights, typically 4 bytes. Run activation function for every neuron (784 + 128 + 64 + 10).
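
The arithmetic on this slide can be reproduced directly. The sketch below uses the layer sizes from the slide's formula (including the trailing "x 1" term) and compares the weight storage with the 256K RAM limit quoted on the Microcontrollers slide; the variable names are illustrative.

```python
# Layer sizes as they appear in the slide's formula, including the final "x 1".
LAYER_SIZES = [784, 128, 64, 10, 1]
BYTES_PER_WEIGHT = 4          # "typically 4 bytes"
RAM_LIMIT_BYTES = 256 * 1024  # "max. 256K RAM" from the Microcontrollers slide


def count_connections(sizes):
    # Every neuron in one layer connects to every neuron in the next.
    return sum(n_in * n_out for n_in, n_out in zip(sizes, sizes[1:]))


connections = count_connections(LAYER_SIZES)
weight_bytes = connections * BYTES_PER_WEIGHT
neurons = sum(LAYER_SIZES[:-1])  # 784 + 128 + 64 + 10

print(connections)                     # 109194
print(weight_bytes)                    # 436776
print(weight_bytes > RAM_LIMIT_BYTES)  # True: the weights alone do not fit in RAM
print(neurons)                         # 986
```
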
  28. Optimizations. Number of connections, (784 x 128) + (128 x 64) + (64 x 10) + (10 x 1) = 109,194: store connections in ROM or external flash, page into RAM when needed. Connections have weights, typically 4 bytes: quantization to 1 byte. Run activation function for every neuron (784 + 128 + 64 + 10): optimized DSP instructions (CMSIS-NN).
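
Quantizing a 4-byte weight to 1 byte can be illustrated with a generic symmetric linear scheme. This is a sketch of the general idea only: it is not necessarily the scheme CMSIS-NN uses, and the function names and the choice of the range -127 to 127 are assumptions.

```python
def quantize(weights):
    # Map the largest absolute weight to 127 so every value fits in one signed byte.
    max_abs = max(abs(w) for w in weights) or 1.0  # avoid a zero scale
    scale = max_abs / 127
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    # Recover approximate float weights; the error is at most half a step.
    return [q * scale for q in quantized]


weights = [0.82, -1.0, 0.25, 0.0, -0.4]
quantized, scale = quantize(weights)
restored = dequantize(quantized, scale)

print(quantized)  # integers between -127 and 127, one byte each
print(max(abs(w - r) for w, r in zip(weights, restored)))  # small rounding error
```

At 1 byte per weight, the 109,194 connections need 109,194 bytes instead of 436,776, at the cost of the rounding error shown above.
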
  29. In practice...
  30. Keyword spotting: real-time keyword detection ("Yes", "No", "Left", "Right", ...). 10 inferences per second (216 MHz). 70 KB RAM used. https://github.com/ARM-software/ML-KWS-for-MCU
  31. Object detection: object detection in video using CIFAR10. 32x32 input in color, 10 classes to detect. 10 images per second (216 MHz). 133 KB RAM used. https://github.com/ARM-software/ML-examples/tree/master/cmsisnn-cifar10
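
The throughput figures on the two slides above imply a per-inference cycle budget. The sketch below computes it under the assumption that the processor spends all of its time on inference, so the result is an upper bound rather than a measured cost; the function name is illustrative.

```python
def cycles_per_inference(clock_hz, inferences_per_second):
    # Upper bound: assumes the core does nothing but run the network.
    return clock_hz / inferences_per_second


CLOCK_HZ = 216_000_000      # 216 MHz, from the slides
INFERENCES_PER_SECOND = 10  # from the slides

budget = cycles_per_inference(CLOCK_HZ, INFERENCES_PER_SECOND)
print(budget)                        # 21600000.0 cycles per inference at most
print(1000 / INFERENCES_PER_SECOND)  # 100.0 ms per inference
```
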
  32. Battery life: 1 coin cell; 1 image every 5 minutes; 100 ms per inference; > 1 year of battery life.
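
A rough duty-cycle estimate shows how a figure like this can come about. Only the 100 ms inference time and the 5 minute interval come from the slide. The cell capacity (220 mAh), active current (20 mA), and sleep current (5 µA) below are hypothetical values chosen for illustration, and the sketch ignores self-discharge and the energy needed to capture the image.

```python
def battery_life_years(capacity_mah, active_ma, sleep_ma, active_s, period_s):
    # Fraction of time the microcontroller is awake and running inference.
    duty = active_s / period_s
    # Time-weighted average current draw in mA.
    average_ma = active_ma * duty + sleep_ma * (1 - duty)
    hours = capacity_mah / average_ma
    return hours / (24 * 365)


INFERENCE_S = 0.1     # 100 ms per inference (from the slide)
PERIOD_S = 5 * 60     # 1 image every 5 minutes (from the slide)
CAPACITY_MAH = 220.0  # assumption: a typical coin cell capacity
ACTIVE_MA = 20.0      # assumption: current while running inference
SLEEP_MA = 0.005      # assumption: deep-sleep current (5 microamps)

years = battery_life_years(CAPACITY_MAH, ACTIVE_MA, SLEEP_MA, INFERENCE_S, PERIOD_S)
print(round(years, 2))  # a little over 2 years under these assumed figures
```

The device is active for only about 0.03% of the time, so under these assumptions the sleep current matters almost as much as the inference cost.
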
  33. Getting started...
  34. Getting started: 1. Buy a development board. 2. Run some tutorials: https://blog.hackster.io/simple-neural-network-on-mcus-a7cbd3dc108c https://github.com/uTensor/ADL_Demo https://github.com/ARM-software/ML-examples/tree/master/cmsisnn-cifar10 https://github.com/ARM-software/ML-KWS-for-MCU
  35. Recap: 1. Machine learning is cool. 2. Machine learning on the edge is even cooler. 3. We're just getting started, so get hacking. 4. ??? 5. PROFIT!!!
  36. https://labs.mbed.com Slides: http://janjongboom.com Thank you!
