19. But the buffer is not big enough for the entire frame!
http://blog.imgtec.com/powervr/a-look-at-the-powervr-graphics-architecture-tile-based-rendering
Tiling
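The size gap that motivates tiling can be checked with simple arithmetic. The sketch below is only an illustration: the 1920x1080 frame, the 32x32 tile size, and 4 bytes per pixel are assumptions chosen for the example, not figures from the slides, and the helper names are hypothetical.

```python
import math

def tile_grid(frame_w, frame_h, tile_w=32, tile_h=32):
    """Return (columns, rows) of tiles needed to cover the whole frame."""
    # Round up so partially covered tiles at the right/bottom edges are counted.
    return math.ceil(frame_w / tile_w), math.ceil(frame_h / tile_h)

def buffer_bytes(width, height, bytes_per_pixel=4):
    """Size in bytes of a color buffer covering a width x height region."""
    return width * height * bytes_per_pixel

# Assumed example values (not from the slides).
cols, rows = tile_grid(1920, 1080)
full_frame = buffer_bytes(1920, 1080)
one_tile = buffer_bytes(32, 32)

print(f"{cols} x {rows} tiles, frame = {full_frame} bytes, tile = {one_tile} bytes")
```

Under these assumptions the frame needs 60 x 34 tiles, and a single tile buffer is 4,096 bytes against 8,294,400 bytes for the full frame, which is why a small on-chip buffer can hold one tile but not the whole image.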
20. Need to solve... the rendering sequence
http://blog.imgtec.com/powervr/a-look-at-the-powervr-graphics-architecture-tile-based-rendering
Tiling
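One common way to handle the rendering sequence in a tile-based renderer is to sort geometry into per-tile lists first and only then render tile by tile. The sketch below illustrates that binning step with a conservative bounding-box test. It is a simplified assumption-laden example, not the PowerVR implementation: the function name, the 32-pixel tile size, and the 2D triangle format are all hypothetical.

```python
def bin_triangles(triangles, frame_w, frame_h, tile=32):
    """Map each tile (tx, ty) to the indices of triangles whose
    bounding box overlaps that tile (a conservative test)."""
    last_tx = (frame_w - 1) // tile
    last_ty = (frame_h - 1) // tile
    bins = {}
    for index, tri in enumerate(triangles):
        xs = [x for x, _ in tri]
        ys = [y for _, y in tri]
        # Skip triangles that lie entirely outside the frame.
        if max(xs) < 0 or max(ys) < 0 or min(xs) >= frame_w or min(ys) >= frame_h:
            continue
        # Clamp the bounding box to the tile grid.
        tx0 = max(0, int(min(xs)) // tile)
        tx1 = min(last_tx, int(max(xs)) // tile)
        ty0 = max(0, int(min(ys)) // tile)
        ty1 = min(last_ty, int(max(ys)) // tile)
        for ty in range(ty0, ty1 + 1):
            for tx in range(tx0, tx1 + 1):
                bins.setdefault((tx, ty), []).append(index)
    return bins

# Two example triangles in a 128x64 frame (assumed data).
triangles = [
    [(5, 5), (20, 5), (5, 20)],      # fits inside tile (0, 0)
    [(30, 10), (70, 10), (30, 40)],  # bounding box spans several tiles
]
bins = bin_triangles(triangles, 128, 64)

# Second pass: visit tiles one at a time, touching only their own triangles.
for tile_key in sorted(bins):
    print(tile_key, bins[tile_key])
```

Each tile can then be rendered entirely in the small on-chip buffer and written out to memory once, because its triangle list is known before rendering starts.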
36. CPU vs GPU : Architecture
https://www.classicspecification.com.ng/the-difference-between-cpu-and-gpu/
Latency driven (CPU) vs. throughput driven (GPU)
37. Different Design Ideas
● GPUs are not built to run many unrelated things at once:
– Not compiling 100 files
– Not serving 10,000 users on a web backend
● But one large thing:
– Searching 1M patches in an image
– Multiplying two 1000x1000 matrices
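The matrix example shows why such a job suits a throughput-oriented design: every output element can be computed independently of the others. The sketch below counts that work using the conventional 2n^3 estimate for a dense n x n product; the function names are hypothetical and the code runs on the CPU purely to illustrate the structure.

```python
def matmul_work(n):
    """Return (independent output elements, approximate floating-point
    operations) for multiplying two n x n matrices."""
    outputs = n * n
    # About n multiplies and n adds per output element (conventional 2*n^3 count).
    flops = 2 * n ** 3
    return outputs, flops

def output_element(a, b, i, j):
    """Compute one element of a @ b. It reads only row i of a and
    column j of b, so all elements could be computed in parallel."""
    return sum(a[i][k] * b[k][j] for k in range(len(b)))

outputs, flops = matmul_work(1000)
print(f"{outputs} independent outputs, about {flops} floating-point operations")

# Tiny check on 2x2 matrices.
a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
c = [[output_element(a, b, i, j) for j in range(2)] for i in range(2)]
print(c)
```

For n = 1000 this gives one million independent outputs and roughly two billion floating-point operations, all following the same instruction sequence on different data.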
39. Throughput is not Free : Bandwidth
         Main Mem      GPU Mem                    Main Mem Bandwidth   GPU Bandwidth
PC       16GB DDR4     8GB GDDR5 or 8GB GDDR5X    34 GB/s              224 GB/s or 320 GB/s
Mobile   4GB LPDDR4    Shared                     17 GB/s              17 GB/s

Reference figures:
LPDDR4: 28.8 GB/s
Qualcomm Snapdragon 820 (S820) LPDDR4: 17.4 GB/s
Intel i7 6700HQ max bandwidth: 34.1 GB/s
GTX 980 GDDR5: 224 GB/s
GTX 1080 GDDR5X: 320 GB/s
http://www.geforce.com.tw/hardware/desktop-gpus/geforce-gtx-980/specifications
http://www.geforce.com/hardware/10series/geforce-gtx-1080
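To get a feel for what these bandwidth figures mean, the sketch below estimates the best-case time to stream a buffer once at each rate. The bandwidth values come from the table; the 1920x1080 RGBA frame, the decimal definition of GB (10^9 bytes), and the function name are assumptions made for the example. Real transfers are slower because of latency and contention.

```python
def transfer_ms(num_bytes, gb_per_s):
    """Best-case time in milliseconds to move num_bytes at gb_per_s,
    taking 1 GB as 10**9 bytes and ignoring latency and contention."""
    return num_bytes / (gb_per_s * 1e9) * 1e3

# Assumed workload: one 1920x1080 frame at 4 bytes per pixel.
FRAME_BYTES = 1920 * 1080 * 4

# Bandwidth figures taken from the table above (GB/s).
BANDWIDTH_GB_S = {
    "PC main memory (DDR4)": 34.0,
    "GTX 980 (GDDR5)": 224.0,
    "GTX 1080 (GDDR5X)": 320.0,
    "Mobile shared (LPDDR4)": 17.0,
}

for name, bandwidth in BANDWIDTH_GB_S.items():
    print(f"{name}: {transfer_ms(FRAME_BYTES, bandwidth):.3f} ms per frame pass")
```

At 17 GB/s a single pass over the frame takes about 0.49 ms, so every extra read or write of the full frame eats into a 16.7 ms budget at 60 frames per second, and on mobile that bandwidth is shared between CPU and GPU.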
40. PC vs Mobile : Architecture
         Main Mem      GPU Mem       Main Mem / GPU Bandwidth   Cooling      Chip + RAM Price
PC       16GB DDR4     8GB GDDR5X    34 GB/s / 224 GB/s         Air/Liquid   USD 1,000
Mobile   4GB LPDDR4    Shared        17 GB/s / 17 GB/s          None         USD 100
41. PC vs Mobile : Performance
         Performance (CPU : GPU : ASIC)   Power Efficiency (GFLOPS/W)
PC       1 : 100 : ?                      ?
Mobile   1 : 10 : ?                       ?
* The CPU figure is for a single CPU thread.
52. New Memory / Storage to address the bandwidth issue
● 3D XPoint
● GDDR5X
● HBM2
53. From Batch to Realtime
● Realtime raytracing
● Interactive weather forecasting
● Map reconstruction after earthquake
● Realtime machine learning
● Face morphing in video conferences