Statistical power consumption analysis and modeling


Published in: Technology

  1. Statistical Power Consumption Analysis and Modeling for GPU-Based Computing, by Xiaohan Ma, Mian Dong, Lin Zhong and Zhigang Deng. 9/25/2013
  2. Content
      1. GPU
      2. Statistical Power Consumption Analysis
      3. Statistical GPU Power Model
      4. Evaluation and Validation
      5. Discussion
      6. References
  3. 1. GPU
      • Graphics Processing Unit
      • Accelerates scientific and engineering applications (example: 3D gaming)
      Fig. 1: NVIDIA GeForce 8800 GT
  4. Why?
      • More integrated transistors
      • Rising power consumption
      • Heat dissipation, complex cooling solutions, noisier fans
      • The challenge of developing energy-efficient code
      • Hence the need to analyze and model the power consumption of GPUs at runtime
  5. 2. Statistical Power Consumption Analysis
      • A high-level methodology for modeling GPU power consumption
      • The first work to apply statistical analysis to model the power consumption of a GPU
      • Exploits the coupling among power consumption characteristics, runtime performance, and dynamic workloads
  6. 2.1 How?
      • Record power consumption, runtime workload signals, and performance data
      • Build a statistical regression model
        o Able to estimate the power consumption of the GPU dynamically
        o Bridges the dynamic workload of a GPU at runtime and its estimated power consumption
      • Uses an NVIDIA GeForce 8800 GT graphics card
  7. 2.2 Data Acquisition
      • Power consumption data
      • GPU workload signal recording
  8. 2.2.1 Power Consumption Data Acquisition
      • Test computer: runs the programs designed to test the GPU
        o NVIDIA GeForce 8800 GT graphics card with a 200 W power specification
        o AMD Athlon 64 X2 3.0 GHz dual-core processor
        o 2 GB memory
        o Corsair TX 750W power supply
      • Host computer: specialized data recording software and a power acquisition system (FLUKE 2680A)
  9. 2.2.2 GPU Workload Signal Processing
      • Record workload signals with the NVIDIA PerfKit performance analysis tool, simultaneously with the power measurements
        o PerfKit is capable of dynamically extracting 39 GPU workload variables
      • Choose 5 major variables
        o They represent the runtime utilization of the major pipeline stages on the GPU
      • Record the GPU workload signals
      • Resample the GPU workload signals
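The resampling step above can be sketched as follows. The slides do not say which resampling method was used, so linear interpolation onto the power-sample timestamps is an assumption here, and the function name `resample_linear` and the sample values are hypothetical.

```python
from bisect import bisect_right


def resample_linear(src_times, src_values, target_times):
    """Linearly interpolate a signal at the given target timestamps.

    Assumption: src_times is strictly increasing. Targets outside the
    recorded range are clamped to the first or last recorded value.
    """
    if not src_times or len(src_times) != len(src_values):
        raise ValueError("src_times and src_values must be non-empty and equal in length")

    resampled = []
    for t in target_times:
        if t <= src_times[0]:
            resampled.append(src_values[0])
            continue
        if t >= src_times[-1]:
            resampled.append(src_values[-1])
            continue
        hi = bisect_right(src_times, t)  # index of the first source sample after t
        lo = hi - 1
        t0, t1 = src_times[lo], src_times[hi]
        v0, v1 = src_values[lo], src_values[hi]
        frac = (t - t0) / (t1 - t0)
        resampled.append(v0 + frac * (v1 - v0))
    return resampled


# Made-up example: a workload counter sampled once per second,
# aligned to power samples taken at the half-second marks.
workload_times = [0.0, 1.0, 2.0, 3.0]
pixel_shader_busy = [10.0, 30.0, 50.0, 40.0]
power_times = [0.5, 1.5, 2.5]
aligned = resample_linear(workload_times, pixel_shader_busy, power_times)
```

After this step every workload variable has one value per power sample, which is the alignment a regression model needs.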
  10. Fig. 2: Recorded and corresponding resampled data
      Five major variables:
      1. vertex_shader_busy (the percentage of time when the vertex shader is busy)
      2. pixel_shader_busy (the percentage of time when the pixel shader is busy)
      3. texture_busy (the percentage of time when the texture unit is busy)
      4. geom_busy (the percentage of time when the geometry shader is busy)
      5. rop_busy (the percentage of time when the ROP unit is active)
  11. 3. Statistical GPU Power Model
      • Assume that
        o the processed power consumption data is $Y = \{Y_{t_1}, Y_{t_2}, \ldots, Y_{t_n}\}$, where $t_i$ denotes the time index
        o the aligned GPU workload data is $X^j = \{X^j_{t_1}, X^j_{t_2}, \ldots, X^j_{t_n}\}$, where $1 \le j \le N$ and $X^j$ represents the $j$th GPU workload variable
  12. Construct a statistical multivariable function (model) $Y_t = F(X^1_t, X^2_t, \ldots, X^N_t)$ that can robustly and accurately predict the GPU power consumption $Y_t$ given any GPU workload variables $(X^1_t, X^2_t, \ldots, X^N_t)$.
  13. 3.1 Methodology
      • Use the 5 major GPU workload variables as inputs.
      • Split the data set into a training subset and a cross-validation subset (test data).
      • Use the training subset to learn a Support Vector Regression (SVR) model with LIBSVM.
      • Compare the cross-validation results of the chosen SVR model with those of a simple least-squares linear regression (SLR) model.
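A minimal sketch of the split and of the SLR baseline from the methodology above, in plain Python. It does not reproduce the authors' code: the helper names, the 25% test fraction, and the synthetic data are all assumptions made for illustration, and the SVR model itself (trained with LIBSVM in the slides) is not shown.

```python
import random


def split_train_test(rows, test_fraction=0.25, seed=0):
    """Shuffle rows reproducibly and split them into (train, test)."""
    rng = random.Random(seed)
    indices = list(range(len(rows)))
    rng.shuffle(indices)
    n_test = max(1, int(len(rows) * test_fraction))
    test = [rows[i] for i in indices[:n_test]]
    train = [rows[i] for i in indices[n_test:]]
    return train, test


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_least_squares(xs, ys):
    """Fit y = w0 + w1*x1 + ... + wN*xN through the normal equations."""
    design = [[1.0] + list(x) for x in xs]
    k = len(design[0])
    ata = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    aty = [sum(row[i] * y for row, y in zip(design, ys)) for i in range(k)]
    return solve_linear_system(ata, aty)


def predict(weights, x):
    """Evaluate the fitted linear model on one workload vector."""
    return weights[0] + sum(w * v for w, v in zip(weights[1:], x))


# Synthetic data only (not measurements): 5 utilization percentages per row
# and a power value generated from a made-up linear relationship.
rng = random.Random(42)
true_weights = [60.0, 0.30, 0.45, 0.20, 0.10, 0.25]
rows = []
for _ in range(40):
    x = [rng.uniform(0.0, 100.0) for _ in range(5)]
    rows.append((x, predict(true_weights, x)))

train, test = split_train_test(rows)
weights = fit_least_squares([x for x, _ in train], [y for _, y in train])
test_errors = [predict(weights, x) - y for x, y in test]
```

Because the synthetic power values are exactly linear in the inputs, the held-out errors here are near zero. Real measurements would not behave this way, which is the gap the SVR model in the slides is meant to narrow.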
  14. Fig. 3: Cross-validation comparison results (panels: graphics program; GPGPU Jorik benchmark)
  15. Table 1: Sum of squared errors of SLR and SVR on the cross-validation data

      Model   OpenGL Geometry Benchmark 1.0 (graphics program)   GPGPU Jorik Benchmark
      SLR     656.83                                             44.523
      SVR     589.73                                             39.427
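As a small illustration of the metric in Table 1, the sketch below defines the sum of squared errors and computes how much lower the SVR error is relative to SLR, using only the four values reported in the table. The function names are hypothetical.

```python
def sum_squared_error(actual, predicted):
    """Sum of squared differences between measured and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))


def relative_reduction(baseline, candidate):
    """Fraction by which candidate is lower than baseline."""
    return (baseline - candidate) / baseline


# Values copied from Table 1.
table1 = {
    "OpenGL Geometry Benchmark 1.0": {"SLR": 656.83, "SVR": 589.73},
    "GPGPU Jorik Benchmark": {"SLR": 44.523, "SVR": 39.427},
}

for benchmark, sse in table1.items():
    reduction = relative_reduction(sse["SLR"], sse["SVR"])
    print(f"{benchmark}: SVR error is {reduction:.1%} lower than SLR")
```

On the reported numbers the reduction comes to roughly 10% for the graphics benchmark and roughly 11% for the GPGPU benchmark.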
  16. The result is clear: for both graphics and GPGPU applications, the chosen SVR model performed measurably better than the traditional SLR model on the cross-validation (test) data set. Per Table 1, its sum of squared errors is roughly 10% lower on the graphics benchmark and roughly 11% lower on the GPGPU benchmark.
  17. 4. Evaluation and Validation
      • How accurate and robust is the proposed statistical model when the GPU runs non-benchmark programs?
  18. Eight test programs (4 graphics programs and 4 GPGPU computing applications) were selected for testing.
      Graphics programs:
      • Nexuiz
      • Xmas Tree
      • HDR
      • Dual Depth Peeling
      Each program ran for 100 seconds.
  19. GPGPU programs:
      • GNN
      • N-body simulation
      • Option Pricing
      • Fast Walsh Transform
      The N-body simulation ran for 20 seconds; the other three stopped automatically once they had generated their outputs.
  20. Table 2: Summary of power prediction errors as a percentage of mean GPU power consumption
  21. 4.1 Results
      Fig. 4: Comparison between the ground truth (blue) and the predicted GPU power consumption (red) for the four chosen graphics programs
  22. Fig. 5: Comparison between the ground truth (blue) and the predicted GPU power consumption (red) for the four chosen GPGPU programs
  23. 5. Discussion
      1. This research studied the correlation between power consumption and the performance of graphics applications using NVIDIA PerfKit.
      2. NVIDIA PerfKit is designed to identify how conventional graphics applications use GPU components.
      3. It cannot identify GPGPU-specific events such as global memory access, which is the largest contributor to GPU power consumption.
  24. 4. The model depends entirely on the workload signals recorded from the GPU at runtime, and these signals sometimes fail to reflect the power consumption of the underlying GPU.
      5. It cannot accurately model power consumption peaks (these may be due to other factors such as bus communication or memory access).
      6. It is hard to predict in advance how much training data will be sufficient.
  25. 5.1 Related Work
      • Statistical Power Modeling of GPU Kernels Using Performance Counters (2010)
      • Quantifying the Impact of GPUs on Performance and Energy Efficiency in HPC Clusters (2010)
      • Performance and Power Analysis of ATI GPU: A Statistical Approach (2011)
      • Tree Structured Analysis on GPU Power Study (2011)
  26. 6. References
      • H. Nagasaka, N. Maruyama, A. Nukada, S. Matsuoka, T. Endo, Statistical Power Modeling of GPU Kernels Using Performance Counters, In Proc. of International Conference on Green Computing, p. 115-122, 2010
      • J. Chen, B. Li, Y. Shang, L. Peng, J. Pier, Tree Structured Analysis on GPU Power Study, In Proc. of 29th International Conference on Computer Design, p. 57-64, 2011
      • Y. Zhang, Y. Hu, B. Li, L. Pen, Performance and Power Analysis of ATI GPU: A Statistical Approach, In Proc. of 6th IEEE International Conference on NAS, p. 149-158, 2011
      • R. Suda, D.Q. Ren, Accurate Measurements and Precise Modeling of Power Dissipation of CUDA Kernels toward Power Optimized High Performance CPU-GPU Computing, In Proc. of International Conference on Parallel and Distributed Computing, Application and Development, p. 432-438, 2009
  27. Thank You