The document discusses the motivation for developing the Tensor Processing Unit (TPU): DNN-based workloads were consuming a large and growing share of datacenter compute resources. It describes how the TPU, developed by Norman Jouppi and colleagues at Google, is far more efficient than CPUs and GPUs on DNN workloads, delivering up to 80x higher performance per watt. It provides details on the TPU architecture and presents experimental results showing that the TPU significantly outperformed GPUs on latency-bound DNN inference tasks.
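
Where the summary touches on the TPU architecture, the paper's central design point is a large systolic matrix-multiply unit that performs 8-bit integer multiplies with wider 32-bit accumulation, which contributes substantially to the performance-per-watt advantage. The NumPy sketch below is purely illustrative (the function names and shapes are my own, not from the paper): it mimics that quantized multiply-accumulate datapath in software.

```python
import numpy as np

def quantize(x, num_bits=8):
    """Map float values to signed 8-bit integers plus a scale factor."""
    scale = np.max(np.abs(x)) / (2 ** (num_bits - 1) - 1)
    q = np.round(x / scale).astype(np.int8)
    return q, scale

def quantized_matmul(a, b):
    """Multiply in int8, accumulate in int32, then dequantize:
    the arithmetic pattern the TPU's matrix unit is built around."""
    qa, sa = quantize(a)  # hypothetical helper, for illustration only
    qb, sb = quantize(b)
    # 8-bit operands widened to 32 bits so the dot-product sums cannot overflow.
    acc = qa.astype(np.int32) @ qb.astype(np.int32)
    return acc.astype(np.float32) * (sa * sb)

# Usage: the quantized result stays close to the full-precision matmul.
rng = np.random.default_rng(0)
a = rng.standard_normal((4, 8)).astype(np.float32)
b = rng.standard_normal((8, 3)).astype(np.float32)
print(np.max(np.abs(quantized_matmul(a, b) - a @ b)))  # small quantization error
```

The design choice this sketch illustrates is the one the paper leans on: 8-bit integer multipliers are far smaller and cheaper in silicon than floating-point units, so many more of them fit in the same area and power budget.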