3. 1. GPU
• Graphics Processing Unit
• Accelerates scientific and engineering applications as well as graphics (example: 3D gaming)
Fig 1: NVIDIA GeForce 8800 GT
9/25/2013 3
4. Why?
• More integrated transistors
• Rising power consumption
• Heat dissipation, complex cooling solutions, noisier fans
• Challenge of developing energy-efficient code
• Analyzing and modeling the power consumption of GPUs at runtime
5. 2. Statistical Power Consumption Analysis
• High-level methodology to model GPU power consumption
• First work that applies statistical analysis to model the power consumption of a GPU
• Exploits the coupling among power consumption characteristics, runtime performance, and dynamic workloads
6. 2.1 How?
• Record power consumption, runtime workload signals, and performance data
• Build a statistical regression model
o Able to estimate the power consumption of the GPU dynamically
o Bridges the dynamic workload of a running GPU and its estimated power consumption
• Uses an NVIDIA GeForce 8800 GT graphics card
7. 2.2 Data Acquisition
• Power consumption data
• GPU Workload Signal Recording
8. 2.2.1 Power Consumption Data
Acquisition
• Test computer: runs programs designed to test the GPU
o NVIDIA GeForce 8800 GT graphics card with a 200 Watt power specification
o AMD Athlon 64 X2 3.0 GHz dual-core processor
o 2 GB memory
o Corsair TX 750W power supply
• Host computer: specialized data recording software and a power acquisition system (FLUKE 2680A)
9. 2.2.2 GPU Workload Signal Processing
• Recorded simultaneously with the power data, using the NVIDIA PerfKit performance analysis tool
- Capable of dynamically extracting 39 GPU workload variables
• Choose 5 major variables
- Represent the runtime utilization of the major pipeline stages on the GPU
• Record GPU workload signals
• Resample GPU workload signals
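The resampling step can be illustrated with a minimal sketch. The slides do not say which resampling method was used, so linear interpolation is an assumption here, and the function name `resample_linear`, the timestamps, and the counter values are all hypothetical. The idea is to bring a workload signal onto the timestamps of the power samples so that both series share one time index.

```python
from bisect import bisect_right

def resample_linear(src_times, src_values, target_times):
    """Linearly interpolate (src_times, src_values) at each target time.

    src_times must be strictly increasing. Targets outside the recorded
    range are clamped to the first or last recorded value.
    """
    if not src_times or len(src_times) != len(src_values):
        raise ValueError("src_times and src_values must be non-empty and of equal length")
    out = []
    for t in target_times:
        if t <= src_times[0]:
            out.append(src_values[0])
        elif t >= src_times[-1]:
            out.append(src_values[-1])
        else:
            hi = bisect_right(src_times, t)  # first index whose time is > t
            lo = hi - 1
            frac = (t - src_times[lo]) / (src_times[hi] - src_times[lo])
            out.append(src_values[lo] + frac * (src_values[hi] - src_values[lo]))
    return out

# Hypothetical data: counter samples at irregular times, power samples every 0.5 s.
counter_times = [0.0, 0.4, 1.1, 1.5, 2.0]
pixel_shader_busy = [10.0, 30.0, 65.0, 45.0, 20.0]
power_times = [0.0, 0.5, 1.0, 1.5, 2.0]

aligned = resample_linear(counter_times, pixel_shader_busy, power_times)
print(aligned)
```

After this step each workload variable has one value per power sample, which is the aligned form the regression model needs.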
10. Fig 2: Recorded and Corresponding Resampled Data
Five major variables:
1. vertex_shader_busy (the percentage of time when the vertex shader is busy)
2. pixel_shader_busy (the percentage of time when the pixel shader is busy)
3. texture_busy (the percentage of time when the texture unit is busy)
4. geom_busy (the percentage of time when the geometry shader is busy)
5. rop_busy (the percentage of time when the ROP unit is active)
11. 3. Statistical GPU Power Model
• Assume:
– Processed power consumption data is $Y = \{Y_{t_1}, Y_{t_2}, \ldots, Y_{t_n}\}$ ($t_i$ denotes the time index)
– Aligned GPU workload data is $X^j = \{X^j_{t_1}, X^j_{t_2}, \ldots, X^j_{t_n}\}$ ($1 \le j \le N$, $X^j$ represents the $j$th GPU workload variable)
12. Constructed a statistical multivariable function (model)
$Y_t = F(X^1_t, X^2_t, \ldots, X^N_t)$
that can robustly and accurately predict the GPU power consumption $Y_t$, given any GPU workload variables $(X^1_t, X^2_t, \ldots, X^N_t)$.
13. 3.1 Methodology
• Use the 5 major GPU workload variables as model inputs.
• Split the data set into a training subset and a cross-validation subset (test data).
• Use the training subset to learn a Support Vector Regression (SVR) model with LIBSVM.
• Compare the cross-validation results of the chosen SVR model with those of a simple least-squares linear regression (SLR) model.
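As a rough illustration of the split and of the SLR baseline only, the sketch below fits a least-squares linear model on a training subset and scores it on a held-out subset. It does not reproduce the SVR model, which the slides say was trained with LIBSVM. The data are synthetic stand-ins rather than measurements, and the 75/25 split, the coefficients, and the function names are assumptions made for this example.

```python
import random

def fit_least_squares(X, y):
    """Fit y ~ b0 + sum_j b_j * x_j by solving the normal equations."""
    rows = [[1.0] + list(r) for r in X]  # prepend an intercept column
    k = len(rows[0])
    # Augmented system [A^T A | A^T y]
    M = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda i: abs(M[i][c]))
        M[c], M[p] = M[p], M[c]
        for i in range(k):
            if i != c:
                f = M[i][c] / M[c][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[c])]
    return [M[i][k] / M[i][i] for i in range(k)]

def predict(coef, x):
    """Evaluate the linear model: intercept plus weighted inputs."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], x))

# Synthetic stand-in data (NOT measurements): 5 utilization percentages per sample.
rng = random.Random(0)
X = [[rng.uniform(0, 100) for _ in range(5)] for _ in range(200)]
true_coef = [60.0, 0.30, 0.50, 0.20, 0.10, 0.15]  # made-up watts and watts per percent
y = [predict(true_coef, x) + rng.gauss(0, 1.0) for x in X]

# Hold out the last 25% of samples as the cross-validation subset.
split = int(0.75 * len(X))
coef = fit_least_squares(X[:split], y[:split])
cv_mae = sum(abs(predict(coef, x) - t)
             for x, t in zip(X[split:], y[split:])) / (len(X) - split)
print("coefficients:", [round(c, 3) for c in coef])
print("cross-validation mean absolute error (W):", round(cv_mae, 3))
```

An SVR model would be trained on the same training subset and scored on the same held-out subset, so the two error figures are directly comparable.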
16. It’s Clear!
Regardless of whether graphics or GPGPU applications are used, the chosen SVR model performed measurably better than the traditional SLR model on the cross-validation (test) data set.
17. 4. Evaluation and Validation
• How accurate and robust is the proposed statistical model when the GPU runs non-benchmark programs?
18. Eight test programs (4 graphics programs and 4 GPGPU computing applications) were selected for testing.
Graphics programs:
• Nexuiz
• Xmas Tree
• HDR
• Dual Depth Peeling
Each program ran for 100 seconds.
19. GPGPU programs:
• GNN
• N-body simulation
• Option Pricing
• Fast Walsh Transform
The N-body simulation ran for 20 seconds; the other three stopped automatically once they had generated their outputs.
20. Table 2 : Summary of Power Prediction Errors as a percentage of mean GPU Power
Consumption
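The caption expresses prediction error as a percentage of mean GPU power consumption. The slides do not state which error statistic was used, so the sketch below assumes mean absolute error. The function name and the sample values are hypothetical.

```python
def error_percent_of_mean(measured, predicted):
    """Mean absolute prediction error as a percentage of mean measured power."""
    if not measured or len(measured) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_power = sum(measured) / len(measured)
    mae = sum(abs(m - p) for m, p in zip(measured, predicted)) / len(measured)
    return 100.0 * mae / mean_power

# Made-up power readings in watts, for illustration only.
measured = [100.0, 110.0, 90.0, 100.0]
predicted = [98.0, 113.0, 92.0, 97.0]
pct = error_percent_of_mean(measured, predicted)
print(f"{pct:.2f}% of mean power")
```

Normalizing by mean power makes errors comparable across programs that draw very different amounts of power.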
21. 4.1 Results
Fig 4 : Comparison between the ground truth (blue) and the predicted GPU
power consumption data (red) for the chosen four graphics programs
22. Fig 5 : Comparison between the ground truth (blue) and the predicted GPU
power consumption data (red) for the chosen four GPGPU programs
23. 5. Discussion
1. This research studied the correlation between power consumption and the performance of graphics applications using NVIDIA PerfKit.
2. NVIDIA PerfKit is designed to identify the usage of GPU components by conventional graphics applications.
3. It cannot identify GPGPU-specific events such as global memory access, which is the largest factor in GPU power consumption.
24. 4. The model depends entirely on the recorded workload signals of the running GPU, and these signals sometimes fail to indicate the power consumption of the underlying GPU.
5. It cannot accurately model power consumption peaks (due to other factors such as bus communication or memory access).
6. It is hard to predict in advance how much training data will be sufficient.
25. 5.1 Related Work
• Statistical Power Modeling of GPU Kernels Using Performance Counters – 2010
• Quantifying the Impact of GPUs on Performance and Energy Efficiency in HPC Clusters – 2010
• Performance and Power Analysis of ATI GPU: A Statistical Approach – 2011
• Tree Structured Analysis on GPU Power Study – 2011
26. 6. References
• H. Nagasaka, N. Maruyama, A. Nukada, S. Matsuoka, T. Endo, Statistical Power Modeling of GPU Kernels Using Performance Counters, In Proc. of International Conference on Green Computing, p. 115-122, 2010
• J. Chen, B. Li, Y. Zhang, L. Peng, J. Peir, Tree Structured Analysis on GPU Power Study, In Proc. of 29th International Conference on Computer Design, p. 57-64, 2011
• Y. Zhang, Y. Hu, B. Li, L. Peng, Performance and Power Analysis of ATI GPU: A Statistical Approach, In Proc. of 6th IEEE International Conference on NAS, p. 149-158, 2011
• R. Suda, D.Q. Ren, Accurate Measurements and Precise Modeling of Power Dissipation of CUDA Kernels toward Power Optimized High Performance CPU-GPU Computing, In Proc. of International Conference on Parallel and Distributed Computing, Applications and Technologies, p. 432-438, 2009