
GPU Computing with Ruby

Presented at Pecha Kucha SG, a follow-up party to RedDotRubyConf 2011.

Published in: Technology, Education

  1. GPU Computing with Ruby. SpeedGo Computing, Chung Shin Yee
  2. CPU vs GPU Architecture: 6 cores vs 1024 cores; 6 GB/s vs 300 GB/s memory bandwidth. (Figure: CUDA C Programming Guide)
  3. CUDA Programming Model. (Figure: CUDA C Programming Guide)
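
The grid/block/thread hierarchy on this slide comes down to a little index arithmetic. The following is a minimal plain-Ruby sketch, not part of SGC Ruby CUDA: `global_thread_ids` is a hypothetical helper that lists the one-dimensional index each thread would compute inside a kernel as `blockIdx.x * blockDim.x + threadIdx.x`.

```ruby
# Hypothetical helper (not part of any CUDA binding): lists the global index
# that each thread of a 1-D launch would compute on the device as
#   blockIdx.x * blockDim.x + threadIdx.x
def global_thread_ids(grid_dim_x, block_dim_x)
  (0...grid_dim_x).flat_map do |block_idx|
    (0...block_dim_x).map do |thread_idx|
      block_idx * block_dim_x + thread_idx
    end
  end
end

# A grid of 2 blocks with 4 threads each covers every index once.
puts global_thread_ids(2, 4).inspect
```

On a real GPU the threads run in parallel and in no guaranteed order; the sequential enumeration here only shows which indices exist.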
  4. Existing Programming Tools
     ● Cg
     ● BrookGPU
     ● GLSL (OpenGL Shading Language)
     ● Nvidia CUDA C/C++
     ● OpenCL
     ● PyCUDA
     Where is the red Ruby?
  5. Bridging Ruby & CUDA C/C++
     ● Ruby C extension
       – Hard to manipulate Ruby objects in C.
       – Compilation problems.
     ● Ruby FFI
       – Bridging purely in Ruby.
       – Supports multiple Ruby implementations.
  6. Ruby Bridge Sample
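
The bridging idea from the previous slide, calling a C function from Ruby without writing a C extension, can be sketched with Ruby's bundled Fiddle library. This is a stand-in for illustration only: it is not the sample from the slide, and it uses Fiddle rather than the ffi gem so that it runs without installing anything. `LIBC`, `STRLEN` and `c_strlen` are hypothetical names, and the sketch assumes a Linux or macOS Ruby where the C standard library is already loaded into the process.

```ruby
require 'fiddle'

# Open the symbols already loaded into the current process. On typical
# Linux and macOS builds of Ruby this includes the C standard library.
LIBC = Fiddle.dlopen(nil)

# Describe the C signature: size_t strlen(const char *s)
STRLEN = Fiddle::Function.new(
  LIBC['strlen'],
  [Fiddle::TYPE_VOIDP],
  Fiddle::TYPE_SIZE_T
)

# Thin Ruby wrapper: the String is handed to C as a pointer to its bytes.
def c_strlen(str)
  STRLEN.call(str)
end

puts c_strlen("vadd.ptx")
```

A binding for a larger C API follows the same pattern: look up each function once, declare its argument and return types, and wrap it in a Ruby method.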
  7. Developing SGC Ruby CUDA
     ● Object-oriented API.
     ● Start with crucial operations.
       – Memory allocation.
       – Memory transfer.
       – Kernel launch.
       – Wrapper for structures.
     ● Documented with YARD.
  8. Driver vs Runtime API
     ● CUDA Driver API
       – For system developers.
       – Supported by PyCUDA.
     ● CUDA Runtime API
       – For computation-centric developers.
     We are going to support both APIs!
  9. Using SGC Ruby CUDA
     ● Kernel program in CUDA C.
  10. Using SGC Ruby CUDA
      ● Compiling kernel into PTX.
        – nvcc --ptx vadd.cu
  11. Using SGC Ruby CUDA
      ● Setup
          require 'rubycu'
          include SGC::CU
          CUInit.init
          d = CUDevice.get(0)
          c = CUContext.create(d)
          m = CUModule.new.load("vadd.ptx")
          f = m.function("vadd")
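  12. Using SGC Ruby CUDA
      ● Memory allocations
          # Device memory: 10 ints of 4 bytes each
          da = CUDevice.malloc(10*4)
          db = CUDevice.malloc(10*4)
          dc = CUDevice.malloc(10*4)
          # Host buffers: inputs (ha, hb), expected results (hc)
          ha = Buffer.new(:int, 10)
          hb = Buffer.new(:int, 10)
          hc = Buffer.new(:int, 10)
          # Host buffer for the results copied back from the GPU
          hd = Buffer.new(:int, 10)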
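
The `10*4` passed to the allocator is the element count times the size of a C `int` in bytes. A small plain-Ruby sketch of that arithmetic, using only the core `Array#pack`, follows. `SIZEOF_INT` and `byte_size` are illustrative names, not part of SGC Ruby CUDA, and the 4-byte `int` is an assumption that holds on common platforms.

```ruby
# sizeof(int) on this platform, measured by packing one native int.
# Assumption: this is 4 on common 32- and 64-bit platforms.
SIZEOF_INT = [0].pack('i').bytesize

# Hypothetical helper: bytes needed to hold `count` elements.
def byte_size(count, elem_size = SIZEOF_INT)
  count * elem_size
end

# Ten ints packed into one contiguous binary string, the layout that a
# host buffer needs before its bytes can be copied to the device.
host_bytes = (0...10).to_a.pack('i*')

puts byte_size(10)         # bytes to request from the allocator
puts host_bytes.bytesize   # bytes occupied by the ten packed ints
```

Deriving the size from the element type instead of hard-coding `4` keeps the allocation and the later transfer sizes in step if the element type changes.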
  13. Using SGC Ruby CUDA
      ● Initialization
          (0...10).each { |i|
            ha[i] = i
            hb[i] = 1
            hc[i] = ha[i] + hb[i]
            hd[i] = 0
          }
  14. Using SGC Ruby CUDA
      ● Transfer inputs to the GPU
          CUMemory.memcpy_htod(da, ha, 4*10)
          CUMemory.memcpy_htod(db, hb, 4*10)
          CUMemory.memcpy_htod(dc, hc, 4*10)
  15. Using SGC Ruby CUDA
      ● Launch kernel on GPU
          # Launch with 1x1x1 grid,
          # 10x1x1 blocks.
          params = [da, db, dc, 10]
          f.launch_kernel(1, 1, 1, 10, 1, 1, 0, 0, params)
      (Figures: CUDA C Programming Guide)
  16. Using SGC Ruby CUDA
      ● Transfer results back to system memory
          CUMemory.memcpy_dtoh(hd, dc, 4*10)
      ● Verify results
          (0...10).each { |i|
            assert_equal(hc[i], hd[i])
          }
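
Slides 11 to 16 form one pipeline: set up, allocate, initialize, copy in, launch, copy out, verify. The kernel source from slide 9 is not in this transcript, so the sketch below assumes that `vadd` computes `c[i] = a[i] + b[i]` for `i < n`, and emulates the launch on the CPU in plain Ruby. `VADD` and `emulate_launch` are hypothetical names, and nothing here calls the SGC Ruby CUDA API.

```ruby
# Assumed kernel body (vadd.cu is not shown in the slides): each thread
# adds one pair of elements, guarded by the element count n.
VADD = lambda do |tid, a, b, c, n|
  c[tid] = a[tid] + b[tid] if tid < n
end

# Hypothetical CPU stand-in for a 1-D kernel launch: runs the kernel once
# per thread, sequentially, block by block.
def emulate_launch(kernel, grid_x, block_x, *params)
  grid_x.times do |block_idx|
    block_x.times do |thread_idx|
      kernel.call(block_idx * block_x + thread_idx, *params)
    end
  end
end

n  = 10
ha = (0...n).to_a             # inputs, initialized as on slide 13
hb = Array.new(n, 1)
hc = ha.zip(hb).map(&:sum)    # expected results computed on the host
hd = Array.new(n, 0)          # receives the emulated "device" results

emulate_launch(VADD, 1, n, ha, hb, hd, n)   # 1 block of 10 threads
puts(hd == hc ? "results match" : "mismatch")
```

Writing the emulated results into a buffer that starts out zeroed, rather than one pre-filled with the expected values, means the comparison can only pass if the kernel actually ran.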
  17. Problematic CUDA Runtime API
      ● For use in a CUDA C/C++ program.
      ● Workaround
        – CUDA C/C++ effectively uses C/C++ bindings.
        – Create dynamic library for the kernel programs.
        – Load the library at runtime.
  18. Current Limitations
      ● Limited data type support.
        – Fixnum → int
        – ?? → long
        – Float → float
        – ?? → double
      ● No support for CUDA C++ templates.
      ● No Ruby in a kernel program.
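
The type mapping matters because a Ruby Float is double precision while a C `float` is single precision. A plain-Ruby sketch using core `pack`/`unpack1` directives shows the sizes involved and the rounding that the Float → float mapping implies. The directive table and the helpers `c_sizeof` and `as_c_float` are illustrative assumptions, not the mapping SGC Ruby CUDA implements.

```ruby
# Array#pack directives for the C types named on the slide (native byte
# order). Illustrative only.
C_TYPES = {
  int:    'l',   # 32-bit signed integer
  long:   'q',   # 64-bit signed integer (a C long is 64-bit on LP64 only)
  float:  'f',   # single-precision float
  double: 'd'    # double-precision float
}.freeze

# Size in bytes of one packed element of the given C type.
def c_sizeof(type)
  zero = %i[float double].include?(type) ? 0.0 : 0
  [zero].pack(C_TYPES.fetch(type)).bytesize
end

# Round-trip a Ruby Float (a double) through a single-precision C float.
def as_c_float(value)
  [value].pack('f').unpack1('f')
end

C_TYPES.each_key { |t| puts "#{t}: #{c_sizeof(t)} bytes" }
puts as_c_float(0.5) == 0.5   # exactly representable in single precision
puts as_c_float(0.1) == 0.1   # not exactly representable, so the value changes
```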
  19. To Support
      ● Texture memory.
      ● New features in CUDA 4.0
        – Multi-GPU.
        – Unified Virtual Memory.
      ● More C data types.
      ● Mac platform.
  20. Try It Now!
          git clone git://github.com/xman/sgc-ruby-cuda.git
          cd sgc-ruby-cuda
          gem install ffi yard
          rake test
          rake yard
      Thank you!