CSTalks - GPGPU - 19 Jan

First talk, on GPGPU, by Tung


  1. Research in GPU Computing (Cao Thanh Tung)
  2. Outline
     ● Introduction to GPU Computing
       – Past: Graphics Processing and GPGPU
       – Present: CUDA and OpenCL
       – A bit on the architecture
     ● Why GPU?
     ● GPU vs. Multi-core and Distributed
     ● Open problems
     ● Where does this go?
  3-5. Introduction to GPU Computing
     ● Who has access to 1,000 processors? YOU
  6-10. Introduction to GPU Computing
     ● In the past – GPU = Graphics Processing Unit
  11. Introduction to GPU Computing
     ● In the past – GPGPU = General Purpose computation using GPUs
  12. Introduction to GPU Computing
     ● Now – GPU = General Processing Unit
       __device__ float3 collideCell(int3 gridPos, uint index...
       {
           uint gridHash = calcGridHash(gridPos);
           ...
           for (uint j = startIndex; j < endIndex; j++) {
               if (j != index) {
                   ...
                   force += collideSpheres(...);
               }
           }
           return force;
       }
  13. Introduction to GPU Computing
     ● Now – We have CUDA (NVIDIA, proprietary) and OpenCL (open standard)
       (same CUDA kernel snippet as on the previous slide; a minimal complete example follows below)
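
     For readers seeing CUDA for the first time, the following is a minimal, self-contained sketch of a complete CUDA program (a vector addition). It is illustrative only and not from the talk; all names (vecAdd, ha, da, ...) are made up for the example.

       // vecadd.cu -- a minimal complete CUDA program (illustrative sketch)
       #include <cstdio>

       // Each GPU thread adds one element of the two input arrays.
       __global__ void vecAdd(const float* a, const float* b, float* c, int n)
       {
           int i = blockIdx.x * blockDim.x + threadIdx.x;
           if (i < n) c[i] = a[i] + b[i];
       }

       int main()
       {
           const int n = 1 << 20;
           const size_t bytes = n * sizeof(float);

           // Host-side input/output arrays.
           float* ha = new float[n];
           float* hb = new float[n];
           float* hc = new float[n];
           for (int i = 0; i < n; ++i) { ha[i] = 1.0f; hb[i] = 2.0f; }

           // Device-side copies.
           float *da, *db, *dc;
           cudaMalloc((void**)&da, bytes);
           cudaMalloc((void**)&db, bytes);
           cudaMalloc((void**)&dc, bytes);
           cudaMemcpy(da, ha, bytes, cudaMemcpyHostToDevice);
           cudaMemcpy(db, hb, bytes, cudaMemcpyHostToDevice);

           // Launch enough 256-thread blocks to cover all n elements.
           const int threads = 256;
           const int blocks  = (n + threads - 1) / threads;
           vecAdd<<<blocks, threads>>>(da, db, dc, n);

           cudaMemcpy(hc, dc, bytes, cudaMemcpyDeviceToHost);
           printf("c[0] = %f\n", hc[0]);   // expect 3.0

           cudaFree(da); cudaFree(db); cudaFree(dc);
           delete[] ha; delete[] hb; delete[] hc;
           return 0;
       }

     The same computation in OpenCL is conceptually identical (a kernel plus host code that allocates buffers and enqueues the kernel), just with a different API.
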
  14. Introduction to GPU Computing
     ● A (just a little) bit on the architecture of the latest NVIDIA GPU (Fermi)
       – Very simple core (even simpler than the Intel Atom)
       – Little cache
  15. Why GPU?
  16. Why GPU?
     ● Performance
  17. Why GPU?
     ● People have used it, and it works:
       – Bio-Informatics
       – Finance
       – Fluid Dynamics
       – Data-mining
       – Computer Vision
       – Medical Imaging
       – Numerical Analytics
  18. Why GPU?
     ● A new, promising area
       – Fast growing
       – Ubiquitous
       – New paradigm → new problems, new challenges
  19. GPU vs. Multi-core
     ● A lot more threads of computation are required:
       – The GPU has a lot more cores than a multi-core CPU.
       – A GPU core is nowhere near as powerful as a CPU core.
  20. GPU vs. Multi-core
     ● Challenges:
       – Not all problems can easily be broken into many small sub-problems to be solved in parallel.
       – Race conditions are much more serious.
       – Atomic operations are doable, but locking is a performance killer. Lock-free algorithms are much preferable (see the sketch below).
       – Memory access is a bottleneck (memory is not that parallel).
       – Debugging is a nightmare.
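
     To make the "atomics instead of locks" point concrete, here is a hedged sketch of a GPU histogram: every thread issues a hardware atomicAdd on the shared bin counters instead of taking a lock. The kernel name and the 256-bin assumption are illustrative, not from the talk.

       // Illustrative sketch: lock-free histogram of byte values.
       __global__ void histogram256(const unsigned char* data, int n,
                                    unsigned int* bins)   // bins[256], zero-initialised
       {
           int i = blockIdx.x * blockDim.x + threadIdx.x;
           if (i < n) {
               // Thousands of threads may hit the same bin at once.
               // atomicAdd serialises only the colliding updates;
               // a per-bin lock would stall entire warps instead.
               atomicAdd(&bins[data[i]], 1u);
           }
       }

     Even this lock-free version still suffers when most inputs fall into a few bins, which is exactly the memory/contention bottleneck mentioned above.
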
  21. GPU vs. Distributed
     ● The GPU allows much cheaper communication between different threads.
     ● GPU memory is still limited compared to a distributed system.
     ● GPU cores are not completely independent processors:
       – Need fine-grained parallelism
       – Reaching the scalability of a distributed system is difficult.
  22. Open problems
     ● Data structures
     ● Algorithms
     ● Tools
     ● Theory
  23. Open problems
     ● Data structures
       – Requirement: able to handle a very high level of concurrent access.
       – Common data structures like dynamic arrays, priority queues or hash tables are not very suitable for the GPU (a sketch of the usual workaround follows below).
       – Some existing work: kD-tree, quad-tree, read-only hash table...
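
     A concrete illustration of why a plain dynamic array does not carry over: the usual GPU substitute for push_back is an atomic counter into a pre-allocated buffer, which only works when an upper bound on the size is known in advance. The sketch below makes that assumption; all names are made up for the example.

       // Sketch: thousands of threads "appending" the elements that pass
       // a filter into one output array of pre-allocated capacity n.
       __global__ void filterAppend(const float* in, int n,
                                    float* out, unsigned int* outCount)
       {
           int i = blockIdx.x * blockDim.x + threadIdx.x;
           if (i < n && in[i] > 0.0f) {
               // Atomically reserve one output slot: no locks, no resizing,
               // but also no stable ordering of the surviving elements.
               unsigned int slot = atomicAdd(outCount, 1u);
               out[slot] = in[i];
           }
       }
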
  24. Open problems
     ● Algorithms
       – Most sequential algorithms need a serious re-design to make good use of such a huge number of cores (a classic example follows below).
         ● Our computational geometry research: use a discrete-space computation to approximate the continuous-space result.
       – Traditional parallel algorithms may or may not work.
         ● Usual assumption: an infinite number of processors
         ● No serious study on this so far!
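
     The classic textbook example of such a re-design is summing an array: the trivial sequential loop becomes a tree-shaped reduction so that thousands of threads have useful work. This is a standard shared-memory reduction sketch, not code from the talk.

       // Each block reduces blockDim.x input elements to one partial sum.
       // Assumes blockDim.x is a power of two.
       __global__ void blockSum(const float* in, float* partial, int n)
       {
           extern __shared__ float s[];            // one float per thread
           int tid = threadIdx.x;
           int i   = blockIdx.x * blockDim.x + tid;
           s[tid]  = (i < n) ? in[i] : 0.0f;
           __syncthreads();

           // Tree reduction: log2(blockDim.x) steps instead of a serial loop.
           for (int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
               if (tid < stride) s[tid] += s[tid + stride];
               __syncthreads();
           }
           if (tid == 0) partial[blockIdx.x] = s[0];
       }
       // Launch (illustrative): blockSum<<<blocks, 256, 256 * sizeof(float)>>>(in, partial, n);
       // The per-block partial sums are then reduced again, or summed on the CPU.
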
  25. Open problems
     ● Tools
       – Programming language: a better language or model to express parallel algorithms?
       – Compiler: optimize GPU code? Auto-parallelization?
         ● There's some work on OpenMP-to-CUDA translation.
       – Debugging tools? Maybe a whole new "art of debugging" is needed.
       – Software engineering is currently far behind the hardware development.
  26. Open problems
     ● Theory
       – Some traditional approaches:
         ● PRAM: CRCW, EREW. Too general.
         ● SIMD: Too restricted.
       – Big-O analysis may not be good enough.
         ● Time complexity is relevant, but work complexity is more important (see Brent's bound below).
         ● Most GPU computing papers only report actual running times.
       – Performance modeling for the GPU, anyone?
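
     One classical way to relate the two measures mentioned above is Brent's bound (stated here for reference; the slide itself does not cite it): for an algorithm with total work W and critical-path length (span) D, the running time T_p on p processors satisfies

       \max\left(\frac{W}{p},\, D\right) \;\le\; T_p \;\le\; \frac{W}{p} + D

     So a GPU algorithm that achieves small depth but blows up the total work W can still lose to a good sequential algorithm in practice, which is exactly why work complexity matters more than a PRAM-style time bound.
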
  27. Where does this go?
     ● Intel/AMD already have 6-core, 12-thread processors (maybe more).
     ● SeaMicro has a server with 512 dual-core Atom processors.
     ● AMD Fusion: CPU + GPU.
     ● The GPU may not stay forever, but massively multi-threaded is definitely the future of computing.
  28. Where to start?
     ● Check your PC.
       – If it's not yet old enough to go to primary school, there's a high chance it has a GPU.
     ● Go to the NVIDIA/ATI website, download a development toolkit, and you're ready to go (see the minimal example below).
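
     To make "download the toolkit and you're ready to go" concrete, here is roughly the smallest complete CUDA program one can build with the toolkit's nvcc compiler. The file name and message are arbitrary; device-side printf needs a Fermi-class (or newer) GPU, and depending on the toolkit version you may need to add -arch=sm_20 to the compile command.

       // hello.cu -- compile with:  nvcc hello.cu -o hello
       #include <cstdio>

       __global__ void hello()
       {
           // Each of the launched threads prints its own index.
           printf("Hello from GPU thread %d\n", threadIdx.x);
       }

       int main()
       {
           hello<<<1, 8>>>();           // 1 block of 8 threads
           cudaDeviceSynchronize();     // wait for the GPU before exiting
           return 0;
       }
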
  29. THANK YOU
     ● Any questions? Just ask.
     ● Any suggestions? What are you waiting for?
     ● Any problem or solution to discuss? Let's have a private talk somewhere (j/k).
