GPU Programming

Introduction to high performance computing with graphics cards

Published in: Technology

GPU Programming

  1. CPU Architecture
       • good for serial programs
       • do many different things well
       • many transistors for purposes other than ALUs (e.g. flow control and caching)
       • memory access is slow (1 GB/s)
       • switching threads is slow
     Image from Alex Moore, “Introduction to Programming in CUDA”, http://astro.pas.rochester.edu/~aquillen/gpuworkshop.html
  2. GPU Architecture
       • many processors perform similar operations on a large data set in parallel (single-instruction, multiple-data parallelism)
       • recent GPUs have around 30 multiprocessors, each containing 8 stream processors
       • GPUs devote most (80%) of their transistors to ALUs
       • fast memory (80 GB/s)
     [Diagram labels: ALUs, Control, Cache]
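
The multiprocessor count quoted on this slide varies from card to card, so it can be worth checking what your own hardware reports. The following is a minimal sketch using the CUDA runtime's device query calls; the helper name `device_count` is made up for this illustration, and the output depends entirely on the installed card.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: number of CUDA devices, or 0 if the runtime reports an error.
int device_count() {
    int n = 0;
    if (cudaGetDeviceCount(&n) != cudaSuccess) return 0;
    return n;
}

int main() {
    int n = device_count();
    if (n == 0) {
        printf("No CUDA-capable device found\n");
        return 0;
    }
    for (int i = 0; i < n; ++i) {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, i) != cudaSuccess) continue;
        printf("Device %d: %s\n", i, prop.name);
        printf("  multiprocessors:       %d\n", prop.multiProcessorCount);
        printf("  global memory:         %zu MiB\n", prop.totalGlobalMem >> 20);
        printf("  shared memory / block: %zu KiB\n", prop.sharedMemPerBlock >> 10);
        printf("  max threads / block:   %d\n", prop.maxThreadsPerBlock);
    }
    return 0;
}
```

Compiling with `nvcc` and running this on a machine without an Nvidia card should simply print the "no device" message.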
  3. Memory Hierarchy
     Image from Johan Seland, “CUDA Programming”, http://heim.ifi.uio.no/~knutm/geilo2008/seland.pdf
  4. Thread Hierarchy
       • a block of threads runs on a single multiprocessor
       • a grid of blocks makes up the entire set of threads launched for a kernel
       • all threads in a block can access the same shared memory
       • there are many more threads than processors
     Image from Johan Seland, “CUDA Programming”, http://heim.ifi.uio.no/~knutm/geilo2008/seland.pdf
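
To make the grid, block, and shared-memory relationship concrete, here is a minimal sketch of a per-block sum. It is not taken from the slides: the names `block_sum`, `gpu_sum`, and `THREADS_PER_BLOCK` are invented for this example, the block size of 256 is an arbitrary power of two, and error checking is left out to keep it short.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

#define THREADS_PER_BLOCK 256  // assumed block size; must be a power of two here

// Each block sums its own slice of the input using shared memory.
__global__ void block_sum(const float *in, float *block_totals, int n) {
    __shared__ float cache[THREADS_PER_BLOCK];       // visible to this block only
    int tid = threadIdx.x;                           // index within the block
    int gid = blockIdx.x * blockDim.x + threadIdx.x; // index within the grid
    cache[tid] = (gid < n) ? in[gid] : 0.0f;
    __syncthreads();
    // Tree reduction: halve the number of active threads each step.
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (tid < stride) cache[tid] += cache[tid + stride];
        __syncthreads();
    }
    if (tid == 0) block_totals[blockIdx.x] = cache[0];
}

// Host wrapper: launches one thread per element, then adds the block totals on the CPU.
float gpu_sum(const std::vector<float> &host) {
    int n = (int)host.size();
    if (n == 0) return 0.0f;
    int blocks = (n + THREADS_PER_BLOCK - 1) / THREADS_PER_BLOCK;
    float *d_in = nullptr, *d_totals = nullptr;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_totals, blocks * sizeof(float));
    cudaMemcpy(d_in, host.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    block_sum<<<blocks, THREADS_PER_BLOCK>>>(d_in, d_totals, n);
    std::vector<float> totals(blocks);
    cudaMemcpy(totals.data(), d_totals, blocks * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_totals);
    float sum = 0.0f;
    for (float t : totals) sum += t;
    return sum;
}

int main() {
    std::vector<float> ones(1000, 1.0f);
    printf("sum = %f\n", gpu_sum(ones));
    return 0;
}
```

With 1000 elements and 256 threads per block, the launch uses 4 blocks and 1024 threads, far more threads than the card has processors, which is the normal situation in CUDA.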
  5. CUDA
       • a set of C extensions for running programs on a GPU
       • available on Windows, Linux, and Mac, but for Nvidia cards only
       • http://www.nvidia.com/object/cuda_home.html
       • relatively easy to convert algorithms to CUDA: look for loops that do the same calculation on every element of an array
       • gives you direct access to the memory architecture
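
As an illustration of the loop-conversion point, the sketch below shows a CPU loop and one possible CUDA equivalent in which each thread handles one array element. The operation (`y = a*x + y`) and the names `saxpy_cpu`, `saxpy_kernel`, and `saxpy_gpu` are chosen for this example and do not come from the slides; error checking is omitted for brevity.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// CPU version: one loop iteration per array element.
void saxpy_cpu(float a, const float *x, float *y, int n) {
    for (int i = 0; i < n; ++i) y[i] = a * x[i] + y[i];
}

// GPU version: the loop body becomes the kernel, the loop index becomes the thread index.
__global__ void saxpy_kernel(float a, const float *x, float *y, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];  // guard: the last block may be partly unused
}

// Host wrapper: copy inputs to the card, launch, copy the result back.
std::vector<float> saxpy_gpu(float a, const std::vector<float> &x, std::vector<float> y) {
    int n = (int)x.size();
    if (n == 0) return y;
    float *d_x = nullptr, *d_y = nullptr;
    cudaMalloc(&d_x, n * sizeof(float));
    cudaMalloc(&d_y, n * sizeof(float));
    cudaMemcpy(d_x, x.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(d_y, y.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    int threads = 256;                         // assumed block size
    int blocks = (n + threads - 1) / threads;  // enough blocks to cover n elements
    saxpy_kernel<<<blocks, threads>>>(a, d_x, d_y, n);
    cudaMemcpy(y.data(), d_y, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_x);
    cudaFree(d_y);
    return y;
}

int main() {
    const int n = 1000;
    std::vector<float> x(n), y(n, 1.0f);
    for (int i = 0; i < n; ++i) x[i] = (float)i;
    std::vector<float> expected = y;
    saxpy_cpu(2.0f, x.data(), expected.data(), n);
    std::vector<float> got = saxpy_gpu(2.0f, x, y);
    int mismatches = 0;
    for (int i = 0; i < n; ++i) if (got[i] != expected[i]) ++mismatches;
    printf("mismatches: %d\n", mismatches);
    return 0;
}
```

The explicit `cudaMalloc` and `cudaMemcpy` calls are where the "direct access to the memory architecture" shows up: the programmer decides what lives in GPU memory and when it moves. For small arrays like this one, those copies likely cost more than the computation saves.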
  6. Results
     Image from Kevin Dale, “A Graphics Hardware-Accelerated Real-Time Processing Pipeline for Radio Astronomy”, presented at AstroGPU, Nov 2007 (for tasks relevant to the MWA Real Time System).
