Design Techniques
for DLA
* DLA stands for Deep Learning Accelerator

(draft)
Dark Silicon
Roofline Model
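The roofline model bounds a kernel's attainable throughput by whichever is lower: the chip's compute peak or memory bandwidth times the kernel's arithmetic intensity. A minimal sketch with illustrative (assumed, not chip-specific) numbers:

```python
def roofline(peak_flops, mem_bw_bytes, intensity_flops_per_byte):
    """Attainable FLOP/s = min(compute roof, bandwidth * arithmetic intensity)."""
    return min(peak_flops, mem_bw_bytes * intensity_flops_per_byte)

peak = 10e12  # 10 TFLOP/s compute peak (assumed)
bw = 100e9    # 100 GB/s memory bandwidth (assumed)

print(roofline(peak, bw, 2))    # low intensity  -> memory bound (200 GFLOP/s)
print(roofline(peak, bw, 500))  # high intensity -> compute bound (10 TFLOP/s)
```

Kernels left of the ridge point (intensity below peak/bandwidth) are limited by memory traffic, which is why so many of the techniques in this deck target data movement rather than arithmetic.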
Layer Behaviors
Convolution
• Computation-intensive
• Kernel sizes range from 1x1xC up to about 11x11xC
• Variants:
• Depthwise separable convolution
• Sparse convolution
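The depthwise-separable variant matters for accelerators because it cuts both weights and MACs sharply. A small parameter-count comparison (illustrative shapes, not from a specific network):

```python
def conv_params(k, c_in, c_out):
    """Weights in a standard KxK convolution."""
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    """Depthwise KxK per input channel + 1x1 pointwise channel mixing."""
    return k * k * c_in + c_in * c_out

k, c_in, c_out = 3, 256, 256
std = conv_params(k, c_in, c_out)                 # 589,824 weights
dws = depthwise_separable_params(k, c_in, c_out)  # 67,840 weights
print(std, dws, round(std / dws, 1))              # roughly 8.7x fewer
```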
Fully-Connected
• Holds most of the network's weights
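A quick count shows why fully-connected layers dominate the weight budget: flattening a feature map into a wide dense layer multiplies every input against every output. The shapes below are illustrative (AlexNet-like, assumed):

```python
# One 3x3 conv layer, 256 -> 256 channels
conv_w = 3 * 3 * 256 * 256        # 589,824 weights

# One FC layer from a 6x6x256 feature map to 4096 units
fc_w = 6 * 6 * 256 * 4096         # 37,748,736 weights

print(conv_w, fc_w, fc_w // conv_w)  # the FC layer holds 64x more weights
```

The opposite holds for compute: convolutions reuse each weight across every spatial position, so they dominate MACs while FC layers dominate storage and memory traffic.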
CNN Accelerator
Hardware Accelerator Design for Machine Learning, 2016
Filter Decomposition
* A Reconfigurable Streaming Deep Convolutional Neural Network Accelerator for Internet of Things 

For Larger Convolution Kernels
Hardware Accelerator Design for Machine Learning, 2016
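One way a fixed small-kernel engine can serve larger kernels rests on a linear-algebra fact: stacking two 3x3 convolutions (no nonlinearity in between) is exactly one 5x5 convolution whose kernel is the full convolution of the two small kernels. A pure-Python check of that identity (single channel, "valid" padding; this is a generic sketch, not the specific decomposition scheme of the cited IoT accelerator paper):

```python
def conv2d_valid(img, ker):
    """2-D cross-correlation with 'valid' padding, lists of lists."""
    ih, iw, kh, kw = len(img), len(img[0]), len(ker), len(ker[0])
    return [[sum(img[i + di][j + dj] * ker[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(iw - kw + 1)]
            for i in range(ih - kh + 1)]

def conv_full(a, b):
    """Full convolution of two kernels -> the composed larger kernel."""
    ah, aw, bh, bw = len(a), len(a[0]), len(b), len(b[0])
    out = [[0] * (aw + bw - 1) for _ in range(ah + bh - 1)]
    for i in range(ah):
        for j in range(aw):
            for p in range(bh):
                for q in range(bw):
                    out[i + p][j + q] += a[i][j] * b[p][q]
    return out

img = [[(i * 7 + j * 3) % 5 for j in range(6)] for i in range(6)]
a = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
b = [[0, 1, 0], [1, -4, 1], [0, 1, 0]]

stacked = conv2d_valid(conv2d_valid(img, a), b)   # two 3x3 passes
direct = conv2d_valid(img, conv_full(a, b))       # one composed 5x5 pass
print(stacked == direct)  # True
```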
Model Compression
pruned and retrained
Hardware Accelerator Design for Machine Learning, 2016
Deep Compression: Compressing DNNs with Pruning, Trained Quantization and Huffman Coding, 2015
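The prune-and-retrain loop starts by zeroing the smallest-magnitude weights; the network is then retrained with those weights held at zero. A minimal magnitude-pruning sketch (the retraining step is not shown):

```python
def prune(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with smallest magnitude."""
    k = int(len(weights) * sparsity)
    threshold = sorted(abs(w) for w in weights)[k - 1] if k else float("-inf")
    return [0.0 if abs(w) <= threshold else w for w in weights]

w = [0.9, -0.05, 0.4, 0.01, -0.7, 0.03]
print(prune(w, 0.5))  # -> [0.9, 0.0, 0.4, 0.0, -0.7, 0.0]
```

The accelerator payoff is that the surviving weights can be stored in a sparse format (plus quantized and entropy-coded, per the cited paper), shrinking both model storage and the memory traffic needed per inference.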
Tensor Core
"SIMD" for the GPU
https://devblogs.nvidia.com/programming-tensor-cores-cuda-9/
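What a tensor core actually executes is a fused matrix multiply-accumulate on a small tile, D = A @ B + C (on Volta, 4x4 tiles with FP16 inputs and FP32 accumulation). A plain-Python reference of that tile operation:

```python
def mma_4x4(A, B, C):
    """Tensor-core style tile op: D = A @ B + C on 4x4 matrices."""
    return [[C[i][j] + sum(A[i][k] * B[k][j] for k in range(4))
             for j in range(4)] for i in range(4)]

I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]  # identity
Z4 = [[0] * 4 for _ in range(4)]                                  # zero tile
B = [[i * 4 + j for j in range(4)] for i in range(4)]
print(mma_4x4(I4, B, Z4) == B)  # True: I @ B + 0 == B
```

In CUDA the same tile op is exposed through the warp-level WMMA API, and larger GEMMs are tiled into many such fused multiply-accumulates.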
Systolic Array
GotoBLAS library
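A systolic array computes a matrix product with an N x N grid of MAC cells: operands are streamed in skewed by one cycle per row/column, each cell multiplies the two values passing through it and accumulates locally (output-stationary), and only the array edges touch memory, much as GotoBLAS-style blocked GEMM keeps tiles resident in fast storage. A timing-level sketch of that dataflow (assumed output-stationary variant):

```python
def systolic_matmul(A, B):
    """Simulate an output-stationary N x N systolic array computing C = A @ B."""
    n = len(A)
    C = [[0] * n for _ in range(n)]
    # At cycle t, cell (i, j) sees A[i][k] and B[k][j] with k = t - i - j,
    # because A is skewed entering from the left and B from the top.
    for t in range(3 * n - 2):
        for i in range(n):
            for j in range(n):
                k = t - i - j
                if 0 <= k < n:
                    C[i][j] += A[i][k] * B[k][j]
    return C

print(systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
# -> [[19, 22], [43, 50]]
```

Each (i, j, k) triple fires exactly once (at cycle t = i + j + k), so the simulation produces the exact matrix product while mirroring the wavefront timing of the hardware.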
General Tricks
• Burst-fetch contiguous blocks
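Burst fetching pays off because DRAM delivers long sequential runs far more efficiently than scattered single-word reads. A sketch of coalescing word addresses into contiguous bursts (hypothetical 4-byte words; function name is illustrative):

```python
def coalesce(addresses, word_size=4):
    """Merge sorted word addresses into (base, length_bytes) burst descriptors."""
    bursts, start, prev = [], None, None
    for a in sorted(addresses):
        if start is None:
            start = prev = a
        elif a == prev + word_size:
            prev = a                      # extends the current burst
        else:
            bursts.append((start, prev + word_size - start))
            start = prev = a              # begin a new burst
    if start is not None:
        bursts.append((start, prev + word_size - start))
    return bursts

print(coalesce([0, 4, 8, 32, 36, 100]))
# -> [(0, 12), (32, 8), (100, 4)]: 6 word reads become 3 bursts
```

Laying out weights and activations so the accelerator's access pattern is contiguous is what makes such coalescing possible in the first place.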
Analog computing
Hardware Accelerator Design for Machine Learning, 2016
Thermal
Dark Memory and Accelerator-Rich System Optimization in the Dark Silicon Era, 2016
Memory Bandwidth
Dark Memory and Accelerator-Rich System Optimization in the Dark Silicon Era, 2016
• Most of the energy is consumed not in computation but in moving data to and from memory
• Widening fetches from 16-bit to 64-bit changes the energy by only about 1.5x
• Zero-copy buffers
• Convolution memory remapping
• Embedded Binarized Neural Networks
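Binarized networks attack the memory-bandwidth problem directly: weights and activations shrink to single bits, and a dot product of +/-1 vectors reduces to XNOR plus popcount. A sketch using Python ints as bit-vectors (+1 encoded as bit 1, -1 as bit 0; encoding choice is an assumption for illustration):

```python
def binary_dot(x_bits, w_bits, n):
    """Dot product of two n-element +/-1 vectors packed as bits: XNOR + popcount."""
    matches = bin(~(x_bits ^ w_bits) & ((1 << n) - 1)).count("1")
    return 2 * matches - n  # each match contributes +1, each mismatch -1

# Two 4-bit sign patterns agreeing in 2 of 4 positions -> dot product 0
print(binary_dot(0b1101, 0b1011, 4))   # -> 0
print(binary_dot(0b1111, 0b1111, 4))   # -> 4 (identical vectors)
```

One machine word then carries 32 or 64 weights per fetch, which is exactly the kind of reduction in data movement the dark-memory analysis above calls for.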

Deep Learning Accelerator Design Techniques