- 1. Research in GPU Computing
  - Cao Thanh Tung
- 2. Outline
  - Introduction to GPU Computing
    - Past: Graphics Processing and GPGPU
    - Present: CUDA and OpenCL
    - A bit on the architecture
  - Why GPU?
  - GPU vs. Multi-core and Distributed
  - Open problems
  - Where does this go?

  (19-Jan-2011, Computing Students talk)
- 3. Introduction to GPU Computing
  - Who has access to 1,000 processors?
- 5. Introduction to GPU Computing
  - Who has access to 1,000 processors? You.
- 6. Introduction to GPU Computing
  - In the past
    - GPU = Graphics Processing Unit
- 11. Introduction to GPU Computing
  - In the past
    - GPGPU = General-Purpose computation using GPUs
- 12. Introduction to GPU Computing
  - Now
    - GPU = ~~Graphics~~ General Processing Unit

  ```cuda
  __device__ float3 collideCell(int3 gridPos, uint index...
  {
      uint gridHash = calcGridHash(gridPos);
      ...
      for (uint j = startIndex; j < endIndex; j++) {
          if (j != index) {
              ...
              force += collideSpheres(...);
          }
      }
      return force;
  }
  ```
- 13. Introduction to GPU Computing
  - Now
    - We have CUDA (NVIDIA, proprietary) and OpenCL (open standard)
- 14. Introduction to GPU Computing
  - A (just a little) bit on the architecture of the latest NVIDIA GPU (Fermi)
    - Very simple cores (even simpler than the Intel Atom)
    - Little cache
- 15. Why GPU?
- 16. Why GPU?
  - Performance
- 17. Why GPU?
  - People have used it, and it works.
    - Bioinformatics
    - Finance
    - Fluid Dynamics
    - Data Mining
    - Computer Vision
    - Medical Imaging
    - Numerical Analytics
- 18. Why GPU?
  - A new, promising area
    - Fast growing
    - Ubiquitous
    - New paradigm → new problems, new challenges
- 19. GPU vs. Multi-core
  - Many more threads of computation are required:
    - The GPU has many more cores than a multi-core CPU.
    - A GPU core is nowhere near as powerful as a CPU core.
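To make "many more threads" concrete, here is a minimal CUDA sketch in which every array element gets its own thread. The kernel name `saxpyKernel` and the host helper `saxpyOnGpu` are hypothetical names chosen for illustration, and error checking of the CUDA calls is left out to keep the sketch short.

```cuda
#include <cuda_runtime.h>
#include <cassert>
#include <vector>

// Each GPU thread handles exactly one element: y[i] = a * x[i] + y[i].
__global__ void saxpyKernel(int n, float a, const float* x, float* y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n) {                                    // guard the last, partially filled block
        y[i] = a * x[i] + y[i];
    }
}

// Hypothetical host helper: copies the data to the GPU, launches at least
// one thread per element, and copies the result back.
std::vector<float> saxpyOnGpu(float a, const std::vector<float>& x, std::vector<float> y) {
    assert(x.size() == y.size() && !x.empty());
    int n = static_cast<int>(x.size());
    size_t bytes = n * sizeof(float);

    float* dX = nullptr;
    float* dY = nullptr;
    cudaMalloc(&dX, bytes);
    cudaMalloc(&dY, bytes);
    cudaMemcpy(dX, x.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dY, y.data(), bytes, cudaMemcpyHostToDevice);

    int threadsPerBlock = 256;
    int blocks = (n + threadsPerBlock - 1) / threadsPerBlock;  // round up
    saxpyKernel<<<blocks, threadsPerBlock>>>(n, a, dX, dY);

    // This copy waits for the kernel to finish before reading the result.
    cudaMemcpy(y.data(), dY, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dX);
    cudaFree(dY);
    return y;
}
```

For a million elements this launches about a million very lightweight threads, which is the normal way to keep the many simple cores busy; a multi-core CPU version would typically use a handful of heavyweight threads instead.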
- 20. GPU vs. Multi-core
  - Challenges:
    - Not all problems can easily be broken into many small sub-problems to be solved in parallel.
    - Race conditions are much more serious.
    - Atomic operations are still doable, but locking is a performance killer. Lock-free algorithms are much preferable.
    - Memory access is a bottleneck (memory is not that parallel).
    - Debugging is a nightmare.
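The race-condition and atomic-operation points can be illustrated with a shared counter. This is only a sketch: `countRacy`, `countAtomic` and `countOnGpu` are hypothetical names, and error checking is omitted. The racy kernel is deliberately wrong, to show what goes missing without atomics.

```cuda
#include <cuda_runtime.h>
#include <cassert>

// Deliberately wrong: every thread does an unsynchronized read-modify-write on
// the same address, so increments from different threads can overwrite each other.
__global__ void countRacy(int n, unsigned int* counter) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        *counter = *counter + 1;
    }
}

// Correct and lock-free: the hardware performs the increment as one atomic step.
__global__ void countAtomic(int n, unsigned int* counter) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        atomicAdd(counter, 1u);
    }
}

// Hypothetical host helper: runs one of the two kernels with n threads
// and returns the final counter value.
unsigned int countOnGpu(int n, bool useAtomic) {
    assert(n > 0);
    unsigned int* dCounter = nullptr;
    cudaMalloc(&dCounter, sizeof(unsigned int));
    cudaMemset(dCounter, 0, sizeof(unsigned int));

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    if (useAtomic) {
        countAtomic<<<blocks, threads>>>(n, dCounter);
    } else {
        countRacy<<<blocks, threads>>>(n, dCounter);
    }

    unsigned int result = 0;
    cudaMemcpy(&result, dCounter, sizeof(unsigned int), cudaMemcpyDeviceToHost);
    cudaFree(dCounter);
    return result;
}
```

`countOnGpu(n, true)` returns exactly `n`. The racy variant usually returns far less, and the value can change from run to run, which is part of why debugging is so painful.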
- 21. GPU vs. Distributed
  - GPU allows much cheaper communication between different threads.
  - GPU memory is still limited compared to a distributed system.
  - GPU cores are not completely independent processors.
    - Need fine-grained parallelism
    - Reaching the scalability of a distributed system is difficult.
- 22. Open problems
  - Data structures
  - Algorithms
  - Tools
  - Theory
- 23. Open problems
  - Data structures
    - Requirement: able to handle a very high level of concurrent access.
    - Common data structures like dynamic arrays, priority queues or hash tables are not very suitable for the GPU.
    - Some existing work: kD-tree, quad-tree, read-only hash table...
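As a rough illustration of what "very high level of concurrent access" demands, the sketch below builds a fixed-capacity hash set with lock-free insertion via `atomicCAS` and linear probing. It is not taken from the talk or from the works it mentions. The names `EMPTY_SLOT`, `hashKey`, `insertKeys` and `countDistinctOnGpu` are hypothetical, and the sketch assumes that the key `0xFFFFFFFF` never occurs and that the table never needs to grow or delete, which is exactly where a general-purpose hash table gets hard on the GPU.

```cuda
#include <cuda_runtime.h>
#include <cassert>
#include <vector>

// Assumption of this sketch: this value marks an empty slot, so it cannot be a key.
#define EMPTY_SLOT 0xFFFFFFFFu

// Hypothetical multiplicative hash; any reasonable integer hash would do.
__device__ unsigned int hashKey(unsigned int key, unsigned int capacity) {
    return (key * 2654435761u) % capacity;
}

// One thread per key. A thread claims a slot with a single compare-and-swap,
// so no locks are needed even when thousands of threads insert at once.
__global__ void insertKeys(const unsigned int* keys, int n,
                           unsigned int* table, unsigned int capacity) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    unsigned int key = keys[i];
    unsigned int slot = hashKey(key, capacity);
    for (unsigned int probe = 0; probe < capacity; ++probe) {
        unsigned int prev = atomicCAS(&table[slot], EMPTY_SLOT, key);
        if (prev == EMPTY_SLOT || prev == key) return;  // inserted, or already present
        slot = (slot + 1) % capacity;                   // linear probing
    }
    // Table is full: this sketch silently drops the key.
}

// Hypothetical host helper: inserts all keys and returns the number of occupied slots.
int countDistinctOnGpu(const std::vector<unsigned int>& keys, unsigned int capacity) {
    assert(!keys.empty() && capacity > 0);
    int n = static_cast<int>(keys.size());
    unsigned int* dKeys = nullptr;
    unsigned int* dTable = nullptr;
    cudaMalloc(&dKeys, n * sizeof(unsigned int));
    cudaMalloc(&dTable, capacity * sizeof(unsigned int));
    cudaMemcpy(dKeys, keys.data(), n * sizeof(unsigned int), cudaMemcpyHostToDevice);
    // Setting every byte to 0xFF makes every 32-bit slot equal to EMPTY_SLOT.
    cudaMemset(dTable, 0xFF, capacity * sizeof(unsigned int));

    int threads = 256;
    insertKeys<<<(n + threads - 1) / threads, threads>>>(dKeys, n, dTable, capacity);

    std::vector<unsigned int> table(capacity);
    cudaMemcpy(table.data(), dTable, capacity * sizeof(unsigned int), cudaMemcpyDeviceToHost);
    cudaFree(dKeys);
    cudaFree(dTable);

    int used = 0;
    for (unsigned int v : table) {
        if (v != EMPTY_SLOT) ++used;
    }
    return used;
}
```

Insertion alone is manageable; resizing, deletion and key-value updates under the same level of concurrency are much harder, which fits the slide's remark that the common structures are a poor match for the GPU.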
- 24. Open problems
  - Algorithms
    - Most sequential algorithms need serious redesign to make good use of such a huge number of cores.
      - Our computational geometry research: use discrete-space computation to approximate the continuous-space result.
    - Traditional parallel algorithms may or may not work.
      - Usual assumption: an infinite number of processors
      - No serious study on this so far!
- 25. Open problems
  - Tools
    - Programming language: a better language or model to express parallel algorithms?
    - Compiler: optimize GPU code? Auto-parallelization?
      - There's some work on OpenMP to CUDA.
    - Debugging tool? Maybe a whole new “art of debugging” is needed.
    - Software engineering is currently far behind the hardware development.
- 26. Open problems
  - Theory
    - Some traditional approaches:
      - PRAM: CRCW, EREW. Too general.
      - SIMD: Too restricted.
    - Big-O analysis may not be good enough.
      - Time complexity is relevant, but work complexity is more important.
      - Most GPU computing papers report only actual running time.
    - Performance modeling for GPU, anyone?
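The difference between time (step) complexity and work complexity shows up in a classic example that is not from the talk: the Hillis-Steele inclusive prefix sum. It finishes in about log2(n) parallel steps but performs roughly n·log2(n) additions in total, whereas the sequential loop needs only n − 1. The sketch below handles a single thread block of at most 1024 elements; `scanHillisSteele` and `inclusiveScanOnGpu` are hypothetical names, and error checking is omitted.

```cuda
#include <cuda_runtime.h>
#include <cassert>
#include <vector>

// Inclusive prefix sum of n ints inside one thread block (Hillis-Steele).
// Steps: ceil(log2 n). Total additions: roughly n * log2(n).
__global__ void scanHillisSteele(const int* in, int* out, int n) {
    extern __shared__ int buf[];  // two buffers of n ints each (double buffering)
    int t = threadIdx.x;
    int src = 0;
    int dst = 1;

    if (t < n) buf[t] = in[t];    // load the input into buffer 0
    __syncthreads();

    for (int offset = 1; offset < n; offset *= 2) {
        if (t < n) {
            int v = buf[src * n + t];
            if (t >= offset) v += buf[src * n + t - offset];
            buf[dst * n + t] = v;
        }
        __syncthreads();          // all writes of this step are done
        src = 1 - src;            // swap the roles of the two buffers
        dst = 1 - dst;
    }

    if (t < n) out[t] = buf[src * n + t];
}

// Hypothetical host helper for inputs of 1 to 1024 elements.
std::vector<int> inclusiveScanOnGpu(const std::vector<int>& in) {
    int n = static_cast<int>(in.size());
    assert(n > 0 && n <= 1024);
    size_t bytes = n * sizeof(int);

    int* dIn = nullptr;
    int* dOut = nullptr;
    cudaMalloc(&dIn, bytes);
    cudaMalloc(&dOut, bytes);
    cudaMemcpy(dIn, in.data(), bytes, cudaMemcpyHostToDevice);

    // One block of n threads, with 2 * n ints of dynamic shared memory.
    scanHillisSteele<<<1, n, 2 * bytes>>>(dIn, dOut, n);

    std::vector<int> out(n);
    cudaMemcpy(out.data(), dOut, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dIn);
    cudaFree(dOut);
    return out;
}
```

With n processors this scan has excellent time complexity, but once n is far larger than the number of cores, the extra log2(n) factor of work dominates the running time. That is one reason work complexity matters more than step count on real GPUs.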
- 27. Where does this go?
  - Intel/AMD already have 6-core, 12-thread processors (maybe more).
  - SeaMicro has a server with 512 Atom dual-core processors.
  - AMD Fusion: CPU + GPU.
  - The GPU may not stay forever, but massive multithreading is definitely the future of computing.
- 28. Where to start?
  - Check your PC.
    - If it is younger than primary-school age, there's a high chance it has a GPU.
  - Go to the NVIDIA or ATI website, download a development toolkit, and you're ready to go.
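For NVIDIA hardware, "check your PC" can be done from code once the CUDA toolkit is installed. The sketch below uses the CUDA runtime calls `cudaGetDeviceCount` and `cudaGetDeviceProperties`; the function name `listCudaDevices` is a hypothetical helper, and ATI/AMD cards would need the OpenCL equivalent instead.

```cuda
#include <cuda_runtime.h>
#include <cstdio>

// Prints a short summary of every CUDA-capable GPU and returns how many were found.
// Returns 0 if there is no usable device or no working driver.
int listCudaDevices() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        std::printf("No usable CUDA device: %s\n", cudaGetErrorString(err));
        return 0;
    }
    for (int d = 0; d < count; ++d) {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, d) != cudaSuccess) continue;
        std::printf("Device %d: %s, compute capability %d.%d, "
                    "%d multiprocessors, %.1f GiB global memory\n",
                    d, prop.name, prop.major, prop.minor,
                    prop.multiProcessorCount,
                    prop.totalGlobalMem / (1024.0 * 1024.0 * 1024.0));
    }
    return count;
}
```

Calling `listCudaDevices()` from `main` and compiling with `nvcc` is enough to see whether the machine is ready for the kind of experiments described in this talk.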
- 29. THANK YOU
  - Any questions? Just ask.
  - Any suggestions? What are you waiting for?
  - Any problem or solution to discuss? Let's have a private talk somewhere (j/k)
