TensorFlow has been implemented on Qualcomm's low-power Hexagon DSP to run neural networks for machine learning. On the Snapdragon 820 processor's Hexagon DSP, models run 8 times faster than on a CPU and draw 1.4 watts of power, compared to 5 watts on the CPU. Qualcomm also tested gemmlowp, a low-precision matrix multiplication library, on the DSP's HVX (Hexagon Vector eXtensions) unit and measured over 5 times the speed of a CPU at much lower power. End-to-end run time for InceptionV1 was around 90 ms on the HVX versus 700 ms on a CPU.
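The figures above can be combined into a rough speedup and energy-per-inference estimate. The sketch below is only back-of-the-envelope arithmetic: it assumes the quoted power figures (1.4 W and 5 W) apply for the full duration of the quoted InceptionV1 run times (90 ms and 700 ms), which the source does not state. The function and variable names are illustrative, not part of TensorFlow or any Qualcomm SDK.

```python
# Figures quoted in the text; pairing power with latency is an assumption.
DSP_LATENCY_S = 0.090   # InceptionV1 end to end on the HVX
CPU_LATENCY_S = 0.700   # InceptionV1 end to end on a CPU
DSP_POWER_W = 1.4       # Hexagon DSP power draw
CPU_POWER_W = 5.0       # CPU power draw


def speedup(baseline_s: float, accelerated_s: float) -> float:
    """Return how many times faster the accelerated run is."""
    return baseline_s / accelerated_s


def energy_joules(power_w: float, latency_s: float) -> float:
    """Energy = power x time, assuming constant power over the run."""
    return power_w * latency_s


inception_speedup = speedup(CPU_LATENCY_S, DSP_LATENCY_S)
dsp_energy = energy_joules(DSP_POWER_W, DSP_LATENCY_S)
cpu_energy = energy_joules(CPU_POWER_W, CPU_LATENCY_S)
energy_ratio = cpu_energy / dsp_energy

print(f"InceptionV1 speedup: {inception_speedup:.1f}x")
print(f"Energy per inference: DSP {dsp_energy:.3f} J, CPU {cpu_energy:.3f} J")
print(f"Energy ratio (CPU / DSP): {energy_ratio:.1f}x")
```

Under that assumption, the InceptionV1 timings work out to a speedup of roughly 7.8x, in line with the "8 times faster" figure, and the energy saving per inference is considerably larger than either the speed or the power ratio alone, since the two multiply.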