Presentation of the Tensor Processing Unit (TPU) at the third TensorFlow and Deep Learning Singapore meetup event, "Generative Deep Learning" (https://www.meetup.com/TensorFlow-and-Deep-Learning-Singapore/)
Slides for "In-Datacenter Performance Analysis of a Tensor Processing Unit", by Carlo C. del Mundo
The document discusses the motivation for developing the Tensor Processing Unit (TPU), which was that DNN-based workloads were consuming a large and growing portion of datacenter compute resources. It describes how the TPU was developed by Norman Jouppi and others at Google to be much more efficient than CPUs and GPUs for DNN workloads, with up to 80x higher performance per watt. It provides details on the TPU architecture and experimental results showing it significantly outperformed GPUs on latency for DNN inference tasks.
The document analyzes the performance of Google's Tensor Processing Unit (TPU) against CPUs and GPUs on neural network inference workloads. It finds that the TPU, an ASIC designed specifically for neural network operations, achieves roughly a 15-30x speedup over contemporary CPUs and GPUs. This is due to the TPU's large array of simple 8-bit integer multiply-accumulate units and its on-chip memory optimized for neural network computations. The document concludes that the TPU is 30-80x more energy efficient than the other hardware and that its performance could increase further with higher memory bandwidth.
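The 8-bit integer arithmetic mentioned above can be illustrated in miniature: quantize float matrices to int8, multiply with int32 accumulation (as accelerator MAC arrays do), then rescale. This is a conceptual sketch only, not the TPU's actual datapath; the scales and shapes below are made up.

```python
import numpy as np

def quantize(x, scale):
    # Map float values onto the int8 grid; a real system derives the
    # scale from calibration data rather than picking it by hand.
    return np.clip(np.round(x / scale), -127, 127).astype(np.int8)

rng = np.random.default_rng(0)
a = rng.standard_normal((4, 8)).astype(np.float32)
b = rng.standard_normal((8, 3)).astype(np.float32)

sa, sb = 0.05, 0.05                      # hand-picked scales for this toy
qa, qb = quantize(a, sa), quantize(b, sb)

# Multiply in int8, accumulating partial sums in int32, then rescale
# the integer result back into float.
acc = qa.astype(np.int32) @ qb.astype(np.int32)
approx = acc.astype(np.float32) * (sa * sb)

exact = a @ b
err = float(np.max(np.abs(approx - exact)))
print("max abs error:", err)
```

The quantized result tracks the float product closely while the inner loop uses only cheap integer multiplies, which is the core of the TPU's efficiency argument.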
Machine Learning with New Hardware Challenges, by Oscar Law
Describes basic neural network design with a focus on convolutional neural network (CNN) architecture, explains why CPUs and GPUs cannot meet CNN hardware requirements, lists three hardware examples (Nvidia, Microsoft, and Google), and highlights optimization approaches for CNN design.
This document discusses distributed processing frameworks for big data. It introduces MapReduce as a programming model that enables parallel processing of large datasets across clusters. While MapReduce was novel, it was limited to batch processing and only supported map and reduce operations. Spark was then proposed as another framework to replace MapReduce, representing computations as directed acyclic graphs and caching datasets in memory for better performance. Both systems introduced challenges in measuring and improving performance at scale.
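The map and reduce operations described above can be sketched in miniature. This toy single-process word count (standing in for a cluster run) shows the shape of the programming model: a map phase emits key-value pairs, the framework shuffles them by key, and a reduce phase aggregates each group.

```python
from collections import defaultdict
from itertools import chain

def map_phase(doc):
    # Map: emit a (word, 1) pair for every word in the document.
    return [(w, 1) for w in doc.split()]

def shuffle(pairs):
    # Shuffle: group values by key, as the framework does between phases.
    groups = defaultdict(list)
    for k, v in pairs:
        groups[k].append(v)
    return groups

def reduce_phase(key, values):
    # Reduce: sum the counts for one word.
    return key, sum(values)

docs = ["spark caches data in memory", "mapreduce writes data to disk"]
pairs = list(chain.from_iterable(map_phase(d) for d in docs))
counts = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
print(counts["data"])   # "data" appears in both documents, so prints 2
```

Spark generalizes this two-stage pipeline into arbitrary DAGs of such operations, keeping intermediate datasets in memory instead of writing them to disk between stages.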
High performance computing - building blocks, production & perspective, by Jason Shih
This document provides an overview of high performance computing (HPC). It defines HPC as using supercomputers and computer clusters to solve advanced computation problems quickly and efficiently through parallel processing. The document discusses the building blocks of HPC systems including CPUs, memory, power consumption, and number of cores. It also outlines some common applications of HPC in fields like physics, engineering, and life sciences. Finally, it traces the evolution of HPC technologies over decades from early mainframes and supercomputers to today's clusters and parallel systems.
Optimizing High Performance Computing Applications for EnergyDavid Lecomber
Energy and power usage in high performance computing and supercomputing is a major issue for system owners and users - we take a look at what developers and administrators can do to reduce application energy costs
CUDA performance study on Hadoop MapReduce Cluster, by airbots
This document summarizes a study on using GPUs (CUDA) to accelerate Hadoop MapReduce workloads. It introduces CUDA into Hadoop clusters, evaluates the speedup and power efficiency on matrix multiplication and molecular dynamics simulations, and concludes that GPU acceleration provides up to a 20x speedup and cuts power consumption by up to 19/20 (roughly 95%), making it a cost-effective alternative to CPU-only upgrades. Future work on porting more applications and supporting heterogeneous GPU/CPU clusters is outlined.
The document summarizes four presentations from the USENIX NSDI 2016 conference session on resource sharing:
1. "Ernest: Efficient Performance Prediction for Large-Scale Advanced Analytics" proposes a framework that uses results from small training jobs to efficiently predict performance of data analytics workloads in cloud environments and reduce the number of required training jobs.
2. "Cliffhanger: Scaling Performance Cliffs in Web Memory Caches" presents algorithms to dynamically allocate memory across queues in Memcached to smooth out performance cliffs and potentially save memory usage.
3. "FairRide: Near-Optimal, Fair Cache Sharing" introduces a caching policy that provides isolation guarantees, prevents strategic behavior, and achieves near-optimal cache efficiency.
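Ernest's idea of predicting large-job runtime from a handful of small profiling runs can be sketched as fitting a simple parametric model of runtime versus cluster size. The feature set below (fixed cost, parallel work that shrinks with more machines, communication terms that grow with them) follows the general shape described in the paper, but the exact terms and the sample runtimes here are illustrative.

```python
import numpy as np

def features(machines):
    m = float(machines)
    # Illustrative Ernest-style features: constant overhead, parallel
    # work (1/m), and communication costs (log m and m).
    return np.array([1.0, 1.0 / m, np.log(m), m])

# Pretend these runtimes (seconds) came from small profiling jobs.
train_machines = [1, 2, 4, 8]
train_times = [10.2, 5.8, 3.9, 3.5]

# Fit the model coefficients by least squares.
X = np.array([features(m) for m in train_machines])
theta, *_ = np.linalg.lstsq(X, np.array(train_times), rcond=None)

# Predict the runtime of a larger deployment from the fitted model.
pred_16 = float(features(16) @ theta)
print("predicted runtime on 16 machines:", round(pred_16, 2))
```

The payoff is that an expensive configuration (16 machines here) never has to be profiled directly; its cost is extrapolated from cheap small runs.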
High performance computing tutorial, with checklist and tips to optimize cluster usage, by Pradeep Reddy Raamana
An introduction to high performance computing: what it is, how to use it, and when to use what. Provides a detailed checklist for building pipelines, tips to optimize cluster usage and reduce queue waiting time, and a quick overview of the resources available in Compute Canada.
A brief introduction to the problems and prospects of OpenCL and distributed heterogeneous computation with Hadoop. For Big Data Dive 2013 (Belarus Java User Group).
For the full video of this presentation, please visit:
https://www.embedded-vision.com/platinum-members/novumind/embedded-vision-training/videos/pages/may-2018-embedded-vision-summit-li
For more information about embedded vision, please visit:
http://www.embedded-vision.com
Miao (Mike) Li, Vice President of IC Engineering at NovuMind, presents the "NovuTensor: Hardware Acceleration of Deep Convolutional Neural Networks for AI" tutorial at the May 2018 Embedded Vision Summit.
Deep convolutional neural networks (DCNNs) are driving explosive growth of the artificial intelligence industry. Effective performance, energy efficiency and accuracy are all significant challenges in DCNN inference, both in the cloud and at the edge. All these factors fundamentally depend on the hardware architecture of the inference engine. To achieve optimal results, a new class of special-purpose AI processor is needed – one that works at optimal efficiency on both computer arithmetic and data movement.
NovuMind achieves this efficiency by exploiting the three-dimensional data relationship inherent in DCNNs, and by combining highly efficient, specialized hardware with an architecture flexible enough to accelerate all foreseeable DCNN structures. The result is the NovuTensor FPGA and ASIC chip, which puts server-class GPU/TPU performance into battery-powered embedded devices.
Flow-centric Computing - A Datacenter Architecture in the Post-Moore Era, by Ryousei Takano
1) The document proposes a new "flow-centric computing" data center architecture for the post-Moore era that focuses on data flows.
2) It involves disaggregating server components and reassembling them as "slices" consisting of task-specific processors and storage connected by an optical network to efficiently process data.
3) The authors expect optical networks to enable high-speed communication between processors, replacing general CPUs, and to potentially revolutionize how data is processed in future data centers.
In this presentation we compare the performance of Spark implementations of important ML algorithms with optimized single-node implementations, and highlight the significant improvements that can be achieved.
Graphics processing units (GPUs) are increasingly being used for general-purpose computing applications due to their highly parallel and programmable nature. GPU computing uses the GPU alongside the CPU in a heterogeneous model, with the sequential CPU portion handling control flow and passing data to the GPU for parallel intensive computations. GPUs have evolved from fixed-function processors into fully programmable parallel processors. Many applications that require large amounts of parallelism and throughput can benefit from offloading work to the GPU. GPU architectures provide a high degree of parallelism through multiple stream processors that can execute the same instructions on different data sets. Software environments like CUDA and OpenCL allow general-purpose programming of GPUs for applications beyond graphics. Future improvements may include
The document discusses optimizing big data analytics on heterogeneous processors. It describes how heterogeneous processors are now common across many device types from smartphones to supercomputers. It outlines the key components of heterogeneous systems, including CPUs, GPUs, and APUs. It also discusses programming models for heterogeneous processors like OpenCL and C++ AMP and how they can provide good performance and productivity. Finally, it presents an approach for nested processing of machine learning and MapReduce tasks on APUs to optimize big data analytics on heterogeneous systems.
1) The document discusses implementing and evaluating deep neural networks (DNNs) on mainstream heterogeneous systems like CPUs, GPUs, and APUs.
2) Preliminary results show that an APU achieves the highest performance per watt compared to CPUs and GPUs for DNN models like MLP and autoencoders.
3) Data transfers between the CPU and GPU are identified as a bottleneck, but APUs can help avoid this issue through efficient data sharing and zero-copy techniques between the CPU and GPU.
Hadoop MapReduce performance study on ARM cluster, by airbots
This presentation presents a performance study of Hadoop MapReduce on an ARM cluster, comparing MapReduce application performance and energy consumption between the ARM cluster and a general x86_64 cluster.
This project deals with the warehouse-scale computers (WSCs) that power the internet services we use today. It covers the hardware blocks used in a Google WSC and the architecture of hardware accelerators such as the graphics processing unit and the Tensor Processing Unit, which let warehouse-scale machines run heavy tasks and support application-specific machine learning and deep learning workloads. It also explains the energy efficiency of the processors used in a Google WSC to achieve high performance, and the performance-enhancement mechanisms Google WSCs employ.
1. OpenCL caffe aims to enable cross-platform machine learning by porting the popular Caffe framework to use OpenCL instead of CUDA. This allows deployment of deep learning models on a variety of devices.
2. Performance optimizations included batching data to improve parallelism, and using multiple command queues to increase concurrent tasks. These provided up to 4.5x speedup over the baseline clBLAS library.
3. While OpenCL caffe performance matched CUDA caffe, a 2x gap remained versus proprietary cuDNN library, indicating potential for further hardware-specific optimizations to close this gap. The work helps address challenges of cross-platform deep learning.
This document discusses parallel computing with GPUs. It introduces parallel computing, GPUs, and CUDA. It describes how GPUs are well-suited for data-parallel applications due to their large number of cores and throughput-oriented design. The CUDA programming model is also summarized, including how kernels are launched on the GPU from the CPU. Examples are provided of simple CUDA programs to perform operations like squaring elements in parallel on the GPU.
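The CUDA pattern that summary describes, one lightweight thread applying the same operation to each element, can be mimicked in plain Python as a CPU stand-in (real CUDA code would define a `__global__` kernel and launch it on a GPU grid; the thread pool here only imitates the data-parallel structure):

```python
from concurrent.futures import ThreadPoolExecutor

def square(x):
    # In CUDA, this body would be the kernel, executed by one GPU
    # thread per element of the input array.
    return x * x

data = list(range(8))

# The executor plays the role of the GPU grid: the same function is
# applied to every element, with no dependencies between elements.
with ThreadPoolExecutor(max_workers=4) as pool:
    result = list(pool.map(square, data))

print(result)   # [0, 1, 4, 9, 16, 25, 36, 49]
```

Because no element depends on any other, the work partitions freely across workers, which is exactly the property that lets GPUs apply thousands of cores to such kernels.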
This document discusses optimizations for TCP/IP networking performance on multicore systems. It describes several inefficiencies in the Linux kernel TCP/IP stack related to shared resources between cores, broken data locality, and per-packet processing overhead. It then introduces mTCP, a user-level TCP/IP stack that addresses these issues through a thread model with pairwise threading, batch packet processing from I/O to applications, and a BSD-like socket API. mTCP achieves a 2.35x performance improvement over the kernel TCP/IP stack on a web server workload.
Which Is Deeper - Comparison of Deep Learning Frameworks on Spark, by Spark Summit
This document compares several deep learning frameworks that run on Apache Spark, including SparkNet, Deeplearning4J, CaffeOnSpark, and Tensorflow on Spark. It outlines the theoretical principles behind data parallelism for distributed stochastic gradient descent. It then evaluates and benchmarks each framework based on criteria like ease of use, functionality, performance, and community support. SparkNet, CaffeOnSpark, and Tensorflow on Spark are shown to have stronger communities and support from organizations. The document concludes that while these frameworks currently lack model parallelism and could experience network congestion, integrating GPUs and improving scalability are areas for future work.
This document discusses NVIDIA's chips for automotive, HPC, and networking. For automotive, it describes the Tegra line of SoC chips used in cars such as Teslas, and upcoming chips like Orin and Atlan. For HPC, it introduces the upcoming Grace CPU designed for giant AI models. For networking, it presents the BlueField line of data processing units (DPUs), including the new 400Gbps BlueField-3 chip and the DOCA software framework. The document emphasizes that NVIDIA's GPU, CPU, and DPU chips make yearly leaps while sharing a common architecture.
GPU HPC Clusters document discusses GPU cluster research at NCSA including early GPU clusters like QP and Lincoln, follow-up clusters like AC that expanded GPU resources, and eco-friendly cluster EcoG. It describes ISL research in GPU and heterogeneous computing including systems software, runtimes, tools and application development.
NVIDIA CEO Jen-Hsun Huang introduces NVLink and shares a roadmap of the GPU. Primary topics also include an introduction of the GeForce GTX Titan Z, CUDA for machine learning, and Iray VCA.
This document summarizes Nvidia's GPU technology conference (GTC16) including announcements about their Tesla P100 GPU and DGX-1 deep learning supercomputer. Key points include:
- The new Tesla P100 GPU delivers up to 21 teraflops of performance for deep learning and uses new technologies like NVLink, HBM2 memory, and a page migration engine.
- The Nvidia DGX-1 is a deep learning supercomputer powered by 8 Tesla P100 GPUs with over 170 teraflops of performance for training neural networks.
- CUDA 8 and unified memory improvements on the P100 enable simpler programming and larger datasets by allowing allocations beyond GPU memory size and
The document discusses plans to establish an institutional high performance computing (HPC) facility at North-West University. It outlines the technical goals of building a Beowulf cluster to link existing departmental clusters and integrate with national and international computational grids. It also discusses management principles for the new HPC facility to ensure sustainability, efficiency, reliability, availability and high performance.
Petascale Analytics - The World of Big Data Requires Big Analytics, by Heiko Joerg Schick
The document discusses big data and analytics technologies. It describes how new technologies like Hadoop and MapReduce enable processing of extremely large datasets. It also discusses future technologies like exascale computing and storage class memory that will be needed to manage increasing data volumes and support real-time analytics.
Distributed Deep Learning with Hadoop and TensorFlow, by Jan Wiegelmann
Training deep neural nets can take a long time and heavy resources. Leveraging existing distributed versions of TensorFlow and Hadoop makes it possible to train neural nets quickly and efficiently.
Accelerated Machine Learning with RAPIDS and MLflow (Nvidia/RAPIDS), by Databricks
Abstract: We will introduce RAPIDS, a suite of open source libraries for GPU-accelerated data science, and illustrate how it operates seamlessly with MLflow to enable reproducible training, model storage, and deployment. We will walk through a baseline example that incorporates MLflow locally, with a simple SQLite backend, and briefly introduce how the same workflow can be deployed in the context of GPU enabled Kubernetes clusters.
In this session I will explain what the Hortonworks and IBM Power solutions are and how they can deliver significant business value through the prompt use of open innovation in future cognitive applications. I will also introduce the unique added value that the IBM-Hortonworks partnership can provide from the viewpoints of storage, analytics, data science, and streaming analysis.
The document provides an overview of big data analysis and parallel programming tools for R. It discusses what constitutes big data, popular big data applications, and relevant hardware and software. It then covers parallel programming challenges and approaches in R, including using multicore processors with the multicore package, SMP and cluster programming with foreach and doMC/doSNOW, NoSQL databases like Redis with doRedis, and job scheduling. The goal is to help users effectively analyze big data in R by leveraging parallelism.
Evolution of Supermicro GPU Server Solution, by NVIDIA Taiwan
Supermicro provides energy-efficient server solutions optimized for GPU computing. The portfolio includes 1U and 4U servers supporting up to 10 GPUs, delivering the highest rack-level and node-level GPU density. The new generation of solutions is optimized for machine learning applications using NVIDIA Pascal GPUs, with features like NVLink for high-bandwidth GPU interconnect and direct low-latency data access between GPUs. These solutions deliver the highest performance per watt for parallel workloads like machine learning training.
The GIST AI-X Computing Cluster provides powerful accelerated computation resources for machine learning using GPUs and other hardware. It includes DGX A100 and DGX-1V nodes with 8 NVIDIA A100 or V100 GPUs each, connected by high-speed networking. The cluster uses Singularity containers, Slurm scheduling, and Ceph storage. It allows researchers to request resources, build container images, and run distributed deep learning jobs across multiple GPUs.
The document discusses accelerating science discovery with AI inference-as-a-service. It describes showcases using this approach for high energy physics and gravitational wave experiments. It outlines the vision of the A3D3 institute to unite domain scientists, computer scientists, and engineers to achieve real-time AI and transform science. Examples are provided of using AI inference-as-a-service to accelerate workflows for CMS, ProtoDUNE, LIGO, and other experiments.
The document provides an update on deep learning and announcements from NVIDIA's GPU Technology Conference (GTC16). It discusses achievements in deep learning like object detection surpassing human-level performance. It also summarizes NVIDIA's latest products like the DGX-1 deep learning supercomputer, Tesla P100 GPU, and improvements to tools like cuDNN that accelerate deep learning. The document emphasizes how these announcements and products will help further progress in deep learning research and applications.
Opportunities of ML-based data analytics in ABCI - Ryousei Takano
This document discusses opportunities for using machine learning-based data analytics on the ABCI supercomputer system. It summarizes:
1) An introduction to the ABCI system and how it is being used for AI research.
2) How sensor data from the ABCI system and job logs could be analyzed using machine learning to optimize data center operation and improve resource utilization and scheduling.
3) Two potential use cases - using workload prediction to enable more efficient cooling system control, and applying machine learning to better predict job execution times to improve scheduling.
GPU Accelerated Data Science with RAPIDS - ODSC West 2020 - John Zedlewski
This document provides an overview of RAPIDS, an open source suite of libraries for GPU-accelerated data science. It discusses how RAPIDS uses GPUs to accelerate ETL, machine learning, and other data science workflows. Key points include:
- RAPIDS includes libraries like cuDF for dataframes, cuML for machine learning, and cuGraph for graph analytics. It aims to provide familiar Python APIs for these tasks.
- cuDF provides over 10x speedups for ETL tasks like data loading, transformations, and feature engineering by keeping data on the GPU.
- cuML provides GPU-accelerated versions of popular scikit-learn algorithms like linear regression, random forests,
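As a minimal sketch of the dataframe workflow described above: the example below uses pandas so it runs without a GPU; with RAPIDS installed, cuDF exposes a near-identical API and keeps the dataframe in GPU memory throughout. The column names and data are made up for illustration.

```python
import pandas as pd  # with RAPIDS installed, cuDF offers a near-identical API

# A typical ETL step of the kind cuDF accelerates: load, transform,
# aggregate. cuDF avoids CPU<->GPU copies by keeping data on the GPU.
df = pd.DataFrame({
    "user":   ["a", "b", "a", "b"],
    "clicks": [1, 2, 3, 4],
})

# Feature-engineering style aggregation: total clicks per user.
clicks_per_user = df.groupby("user")["clicks"].sum()
```

The same code, pointed at cuDF instead of pandas, is the source of the ETL speedups the talk describes: the API stays familiar while the execution moves to the GPU.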
BioPig for scalable analysis of big sequencing data - Zhong Wang
This document introduces BioPig, a Hadoop-based analytic toolkit for large-scale genomic sequence analysis. BioPig aims to provide a flexible, high-level, and scalable platform to enable domain experts to build custom analysis pipelines. It leverages Hadoop's data parallelism to speed up bioinformatics tasks like k-mer counting and assembly. The document demonstrates how BioPig can analyze over 1 terabase of metagenomic data using just 7 lines of code, much more simply than alternative MPI-based solutions. While challenges remain around optimization and integration, BioPig shows promise for scalable genomic analytics on very large datasets.
Null Bangalore | Pentesters Approach to AWS IAM - Divyanshu
# Abstract:
- Learn real-world methods for auditing AWS IAM (Identity and Access Management) as a pentester. We will briefly discuss IAM, then walk through typical misconfigurations and their potential exploits to reinforce an understanding of IAM security best practices.
- Gain actionable insights into AWS IAM policies and roles, using a hands-on approach.
# Prerequisites:
- Basic understanding of AWS services and architecture
- Familiarity with cloud security concepts
- Experience using the AWS Management Console or AWS CLI.
- For hands on lab create account on [killercoda.com](https://killercoda.com/cloudsecurity-scenario/)
# Scenarios Covered:
- Basics of IAM in AWS
- Implementing IAM Policies with Least Privilege to Manage S3 Bucket
- Objective: Create an S3 bucket with least privilege IAM policy and validate access.
- Steps:
- Create S3 bucket.
- Attach least privilege policy to IAM user.
- Validate access.
- Exploiting IAM PassRole Misconfiguration
- Allows a user to pass a specific IAM role to an AWS service (EC2), typically used for service access delegation. We then exploit the PassRole misconfiguration to gain unauthorized access to sensitive resources.
- Objective: Demonstrate how a PassRole misconfiguration can grant unauthorized access.
- Steps:
- Allow user to pass IAM role to EC2.
- Exploit misconfiguration for unauthorized access.
- Access sensitive resources.
- Exploiting IAM AssumeRole Misconfiguration with Overly Permissive Role
- An overly permissive IAM role configuration can lead to privilege escalation: we create a role with administrative privileges and allow a user to assume it.
- Objective: Show how overly permissive IAM roles can lead to privilege escalation.
- Steps:
- Create role with administrative privileges.
- Allow user to assume the role.
- Perform administrative actions.
- Differentiation between PassRole and AssumeRole
Try at [killercoda.com](https://killercoda.com/cloudsecurity-scenario/)
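As a concrete illustration of the least-privilege scenario above, a policy of the following shape (bucket name hypothetical) grants an IAM user only the S3 actions it needs on one bucket:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "LeastPrivilegeS3Access",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::example-pentest-bucket",
        "arn:aws:s3:::example-pentest-bucket/*"
      ]
    }
  ]
}
```

By contrast, the PassRole misconfiguration exploited in the second scenario typically stems from granting `iam:PassRole` with `"Resource": "*"`, which lets the user attach any role in the account, including an administrative one, to an EC2 instance.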
VARIABLE FREQUENCY DRIVE. VFDs are widely used in industrial applications for... - PIMR BHOPAL
A Variable Frequency Drive (VFD) is an electronic device used to control the speed and torque of an electric motor by varying the frequency and voltage of its power supply. VFDs are widely used in industrial applications for motor control, providing significant energy savings and precise motor operation.
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024 - Sinan KOZAK
Sinan, from the Delivery Hero mobile infrastructure engineering team, takes a deep dive into performance acceleration through Gradle build-cache optimization, sharing the team's journey solving complex build-cache problems that affect Gradle builds. By walking through the challenges and solutions found along the way, the talk demonstrates what is possible for faster builds. The case study reveals how overlapping outputs and cache misconfigurations led to significant increases in build times, especially as the project scaled up to numerous modules using Paparazzi tests. The journey from diagnosing to defeating cache issues offers invaluable lessons on maintaining cache integrity without sacrificing functionality.
Software Engineering and Project Management - Introduction, Modeling Concepts... - Prakhyath Rai
Introduction, Modeling Concepts and Class Modeling: What is object orientation? What is OO development? OO themes; evidence for the usefulness of OO development; OO modeling history. Modeling as a design technique: modeling, abstraction, the three models. Class Modeling: object and class concepts, link and association concepts, generalization and inheritance, a sample class model, navigation of class models, and UML diagrams
Building the Analysis Models: Requirement Analysis, Analysis Model Approaches, Data modeling Concepts, Object Oriented Analysis, Scenario-Based Modeling, Flow-Oriented Modeling, class Based Modeling, Creating a Behavioral Model.
Supermarket Management System Project Report.pdf - Kamal Acharya
Supermarket Management is a stand-alone J2EE application developed in Eclipse Juno.
This project contains all the information required to maintain a supermarket
billing system.
The core idea of this project is to minimize paperwork and centralize the
data. All communication is handled securely: in this application the
information is stored on the client itself, and for further security the
database is kept in a back-end Oracle database so that no intruder can access it.
Digital Twins Computer Networking Paper Presentation.pptx - aryanpankaj78
A Digital Twin in computer networking is a virtual representation of a physical network, used to simulate, analyze, and optimize network performance and reliability. It leverages real-time data to enhance network management, predict issues, and improve decision-making processes.
Applications of artificial Intelligence in Mechanical Engineering.pdf - Atif Razi
Historically, mechanical engineering has relied heavily on human expertise and empirical methods to solve complex problems. With the introduction of computer-aided design (CAD) and finite element analysis (FEA), the field took its first steps towards digitization. These tools allowed engineers to simulate and analyze mechanical systems with greater accuracy and efficiency. However, the sheer volume of data generated by modern engineering systems and the increasing complexity of these systems have necessitated more advanced analytical tools, paving the way for AI.
AI offers the capability to process vast amounts of data, identify patterns, and make predictions with a level of speed and accuracy unattainable by traditional methods. This has profound implications for mechanical engineering, enabling more efficient design processes, predictive maintenance strategies, and optimized manufacturing operations. AI-driven tools can learn from historical data, adapt to new information, and continuously improve their performance, making them invaluable in tackling the multifaceted challenges of modern mechanical engineering.
Discover the latest insights on Data Driven Maintenance with our comprehensive webinar presentation. Learn about traditional maintenance challenges, the right approach to utilizing data, and the benefits of adopting a Data Driven Maintenance strategy. Explore real-world examples, industry best practices, and innovative solutions like FMECA and the D3M model. This presentation, led by expert Jules Oudmans, is essential for asset owners looking to optimize their maintenance processes and leverage digital technologies for improved efficiency and performance. Download now to stay ahead in the evolving maintenance landscape.
15.
Google TPU – Performance (roofline)
Roofline: an insightful visual performance model for multicore architectures
Samuel Williams, Andrew Waterman, David Patterson
Communications of the ACM, Volume 52, Issue 4, April 2009
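The roofline model cited on this slide can be sketched in a few lines: attainable throughput is the minimum of peak compute and memory bandwidth times operational intensity. The figures below are the TPU paper's published numbers (92 TOPS peak, 34 GB/s memory bandwidth), used here purely as illustrative inputs.

```python
# Sketch of the roofline model (Williams, Waterman, Patterson).
# Attainable throughput is capped either by peak compute or by
# memory bandwidth multiplied by operational intensity (ops/byte).
def roofline(peak_ops_per_s, mem_bw_bytes_per_s, ops_per_byte):
    """Return attainable ops/s under the roofline model."""
    return min(peak_ops_per_s, mem_bw_bytes_per_s * ops_per_byte)

peak = 92e12   # peak compute, ops/s
bw = 34e9      # memory bandwidth, bytes/s

low = roofline(peak, bw, 10)        # low intensity: memory-bound
high = roofline(peak, bw, 10_000)   # high intensity: compute-bound
```

At low operational intensity the result sits on the bandwidth-limited slope of the roofline; at high intensity it hits the flat peak-compute ceiling, which is exactly the distinction the slide's plot makes for the TPU's workloads.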
16.
TPU is an ASIC for NNets
- BIG Matrix Unit: 256x256 8-bit = 65,536 MACs (32-bit accumulators)
- TPU on average 15X-30X faster than GPU or CPU
- TOPS/Watt about 30X-80X higher
- A future TPU could use GDDR5 memory (as GPUs do):
  - triple the achieved TOPS
  - raise TOPS/Watt to nearly 70X the GPU
  - raise TOPS/Watt to nearly 200X the CPU
Google TPU - Summary
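The Matrix Unit's arithmetic can be sketched in NumPy. This illustrates only the numerics (8-bit operands, 32-bit accumulation), not the systolic-array hardware itself: widening to 32 bits before the multiply-accumulate is what keeps a full 256-wide dot product of worst-case int8 values from overflowing.

```python
import numpy as np

def mac_int8_acc32(a, b):
    """Matrix multiply of int8 operands with int32 accumulation."""
    assert a.dtype == np.int8 and b.dtype == np.int8
    # Widen before the multiply-accumulate so partial sums cannot overflow.
    return a.astype(np.int32) @ b.astype(np.int32)

# Worst-case positive int8 inputs across a full 256-wide dot product:
a = np.full((256, 256), 127, dtype=np.int8)
b = np.full((256, 256), 127, dtype=np.int8)
c = mac_int8_acc32(a, b)
# Each output element is 256 * 127 * 127 = 4,129,024, which fits easily
# in a 32-bit accumulator but would overflow an 8- or 16-bit one.
```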
17.
Vector Computation Unit in a Neural Network Processor
Gregory Michael Thorson, Christopher Aaron Clark, Dan Luu.
https://www.google.com/patents/US20160342889
Batch Processing in a Neural Network Processor
Reginald Clifford Young
https://www.google.com/patents/US20160342890
Neural Network Processor
Jonathan Ross, Norman Paul Jouppi, Andrew Everett Phelps, Reginald Clifford Young, Thomas Norrie, Gregory Michael Thorson, Dan Luu.
https://www.google.com/patents/US20160342891
System and method for parallelizing convolutional neural networks
Alexander Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton
https://www.google.com/patents/US20140180989
Google TPU - Patents
18.
Computing Convolutions Using a Neural Network Processor
Jonathan Ross, Andrew Everett Phelps.
https://www.google.com/patents/WO2016186811A1
Prefetching Weights for a Neural Network Processor
Jonathan Ross.
https://www.google.com/patents/US20160342892
Rotating Data for Neural Network Computations
Jonathan Ross, Gregory Michael Thorson.
http://google.com/patents/US20160342893
Google TPU - Patents