This work dates back to 2011 when presented the most optimized ASIC architecture for matrix multiplication inner kernel Read less