Introduction to CUDA: Geek Camp Singapore 2011

This presentation is for Geek Camp Singapore 2011 (1 October).

Published in: Technology, Sports

  1. INTRODUCTION TO CUDA. Prepared for Geek Camp Singapore 2011. Raymond Tay
  2. THE FREE LUNCH IS OVER (HERB SUTTER)
  3. WE NEED TO THINK BEYOND MULTI-CORE CPUS … WE NEED TO THINK MANY-CORE GPUS …
  4. NVIDIA GPUS: FLOPS. FLOPS stands for floating-point operations per second, a measure of how many floating-point operations a processor can perform each second. More is better. GPUs beat CPUs.
  5. NVIDIA GPUS: MEMORY BANDWIDTH. With massively parallel processors in NVIDIA's GPUs, high memory bandwidth plays a big role in high-performance computing. GPUs beat CPUs.
  6. GPU VS CPU
       CPU
         • Optimised for low-latency access to cached data sets
         • Control logic for out-of-order and speculative execution
       GPU
         • Optimised for data-parallel, throughput computation
         • Architecture tolerant of memory latency
         • More transistors dedicated to computation
  7. I DON'T KNOW C/C++, SHOULD I LEAVE? Relax, no worries. Not to fret. Your brain asks: "Wait a minute, why should I learn the C/C++ SDK?" CUDA answers: "Efficiency!!!"
  8. WHAT DO I NEED TO BEGIN WITH CUDA? An NVIDIA CUDA-enabled graphics card, e.g. one based on the Fermi architecture.
  9. HOW DOES CUDA WORK? Data moves between CPU memory and GPU memory over the PCI bus:
       1. Copy input data from CPU memory to GPU memory
       2. Load GPU program and execute, caching data on chip for performance
       3. Copy results from GPU memory to CPU memory
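The three steps on the "How does CUDA work" slide can be sketched with the CUDA runtime API. This is a minimal illustration, not code from the talk: the kernel `double_elements` and the wrapper `double_on_gpu` are hypothetical names, and the block size of 256 is an arbitrary assumption.

```cuda
#include <cassert>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: doubles every element in place.
__global__ void double_elements(float* data, unsigned int n)
{
    unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;  // guard: the last block may have spare threads
}

// Hypothetical wrapper that walks through the three steps for one host vector.
std::vector<float> double_on_gpu(const std::vector<float>& input)
{
    const unsigned int n = static_cast<unsigned int>(input.size());
    std::vector<float> output(n);
    if (n == 0) return output;
    const size_t bytes = n * sizeof(float);

    float* d_data = nullptr;
    cudaError_t err = cudaMalloc(&d_data, bytes);
    assert(err == cudaSuccess);

    // Step 1: copy input data from CPU memory to GPU memory.
    err = cudaMemcpy(d_data, input.data(), bytes, cudaMemcpyHostToDevice);
    assert(err == cudaSuccess);

    // Step 2: launch the GPU program (block size 256 is an assumed value).
    const unsigned int block_size = 256;
    const unsigned int grid_size = (n + block_size - 1) / block_size;
    double_elements<<<grid_size, block_size>>>(d_data, n);
    err = cudaGetLastError();
    assert(err == cudaSuccess);

    // Step 3: copy results from GPU memory back to CPU memory.
    // cudaMemcpy waits for the kernel above to finish before copying.
    err = cudaMemcpy(output.data(), d_data, bytes, cudaMemcpyDeviceToHost);
    assert(err == cudaSuccess);

    cudaFree(d_data);
    return output;
}
```

Calling `double_on_gpu` from `main` with a small `std::vector<float>` and compiling with `nvcc` exercises all three steps on a CUDA-capable card.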
  10. EXAMPLE: BLOCK CYPHER
        CPU program

          void host_shift_cypher(unsigned int *input_array, unsigned int *output_array,
                                 unsigned int shift_amount, unsigned int alphabet_max,
                                 unsigned int array_length)
          {
              // One loop iteration per element.
              for (unsigned int i = 0; i < array_length; i++)
              {
                  unsigned int shifted = input_array[i] + shift_amount;
                  if (shifted > alphabet_max)
                  {
                      shifted = shifted % (alphabet_max + 1);  // wrap around the alphabet
                  }
                  output_array[i] = shifted;
              }
          }

          int main() {
              // The arrays and parameters are assumed to be set up elsewhere.
              host_shift_cypher(input_array, output_array, shift_amount,
                                alphabet_max, array_length);
          }

        GPU program

          __global__ void shift_cypher(unsigned int *input_array, unsigned int *output_array,
                                       unsigned int shift_amount, unsigned int alphabet_max,
                                       unsigned int array_length)
          {
              // One thread per element.
              unsigned int tid = threadIdx.x + blockIdx.x * blockDim.x;
              if (tid >= array_length) return;  // spare threads in the last block do nothing
              unsigned int shifted = input_array[tid] + shift_amount;
              if (shifted > alphabet_max)
                  shifted = shifted % (alphabet_max + 1);
              output_array[tid] = shifted;
          }

          int main() {
              // input_array and output_array are assumed to be GPU memory set up elsewhere.
              // Round the block count up so every element gets a thread.
              dim3 dimGrid((array_length + block_size - 1) / block_size);
              dim3 dimBlock(block_size);
              shift_cypher<<<dimGrid, dimBlock>>>(input_array, output_array, shift_amount,
                                                  alphabet_max, array_length);
          }
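The grid size in the cypher example has to be rounded up, which means the last block can contain threads with no element to process. The sketch below works through that arithmetic on the host. The helper names `blocks_needed` and `surplus_threads` are hypothetical, and the formula assumes `array_length + block_size - 1` does not overflow an `unsigned int`.

```cuda
#include <cassert>

// Hypothetical helper: smallest number of blocks whose threads cover array_length.
unsigned int blocks_needed(unsigned int array_length, unsigned int block_size)
{
    assert(block_size > 0);
    // Integer ceiling division; assumes the sum below does not overflow.
    return (array_length + block_size - 1) / block_size;
}

// Hypothetical helper: threads launched past the end of the array.
// A kernel needs a bounds check on its thread index to keep these idle.
unsigned int surplus_threads(unsigned int array_length, unsigned int block_size)
{
    return blocks_needed(array_length, block_size) * block_size - array_length;
}
```

For example, 1000 elements with 256 threads per block need 4 blocks, which launches 1024 threads and leaves 24 of them without work. Plain integer division would give 3 blocks and leave the last 232 elements unprocessed.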
  11. EXAMPLE: VECTOR ADDITION

          // CUDA CODE
          __global__ void VecAdd(const float* A, const float* B, float* C, unsigned int N)
          {
              int i = blockDim.x * blockIdx.x + threadIdx.x;
              if (i < N)
                  C[i] = A[i] + B[i];
          }

          // C CODE
          void VecAdd(const float* A, const float* B, float* C, unsigned int N)
          {
              for (int i = 0; i < N; ++i)
                  C[i] = A[i] + B[i];
          }
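The slide shows only the kernel, so here is one possible host-side driver for it, with error checking at each stage. It is a sketch under assumptions: the wrapper name `vec_add_on_gpu`, the use of `std::vector`, and the 256-thread block size are not from the talk, and the kernel index is declared `unsigned int` here to match `N`.

```cuda
#include <cassert>
#include <vector>
#include <cuda_runtime.h>

__global__ void VecAdd(const float* A, const float* B, float* C, unsigned int N)
{
    unsigned int i = blockDim.x * blockIdx.x + threadIdx.x;
    if (i < N) C[i] = A[i] + B[i];
}

// Hypothetical wrapper: returns cudaSuccess, or the first error encountered.
cudaError_t vec_add_on_gpu(const std::vector<float>& a,
                           const std::vector<float>& b,
                           std::vector<float>& c)
{
    assert(a.size() == b.size());
    const unsigned int n = static_cast<unsigned int>(a.size());
    c.resize(n);
    if (n == 0) return cudaSuccess;
    const size_t bytes = n * sizeof(float);

    float* d_a = nullptr;
    float* d_b = nullptr;
    float* d_c = nullptr;
    cudaError_t err = cudaMalloc(&d_a, bytes);
    if (err == cudaSuccess) err = cudaMalloc(&d_b, bytes);
    if (err == cudaSuccess) err = cudaMalloc(&d_c, bytes);
    if (err == cudaSuccess) err = cudaMemcpy(d_a, a.data(), bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) err = cudaMemcpy(d_b, b.data(), bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        const unsigned int threads = 256;  // assumed block size
        const unsigned int blocks = (n + threads - 1) / threads;
        VecAdd<<<blocks, threads>>>(d_a, d_b, d_c, n);
        err = cudaGetLastError();          // reports invalid launch configurations
    }
    if (err == cudaSuccess) err = cudaDeviceSynchronize();  // reports errors raised while the kernel ran
    if (err == cudaSuccess) err = cudaMemcpy(c.data(), d_c, bytes, cudaMemcpyDeviceToHost);

    // cudaFree on a null pointer is a no-op, so this is safe after a partial failure.
    cudaFree(d_a);
    cudaFree(d_b);
    cudaFree(d_c);
    return err;
}
```

Returning the `cudaError_t` rather than aborting lets the caller decide how to report a failure, for instance by printing `cudaGetErrorString(err)`.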
  12. DEBUGGER
        CUDA-GDB
          • Based on GDB
          • Linux
          • Mac OS X
        Parallel Nsight
          • Plugin inside Visual Studio
  13. VISUAL PROFILER & MEMCHECK
        Profiler
          • Microsoft Windows
          • Linux
          • Mac OS X
          • Analyze performance
        CUDA-MEMCHECK
          • Microsoft Windows
          • Linux
          • Mac OS X
          • Detect memory access errors
  14. WHERE'S CUDA AT IN 2011?
          • 60,000 researchers use it to aid drug discovery
          • 470 universities teach CUDA
  15. WHERE'S CUDA AT IN 2011? (PART 2)
          • NVIDIA Showcase (1000+ applications)
  16. ADDITIONAL RESOURCES
          • CUDA FAQ (http://tegradeveloper.nvidia.com/cuda-faq)
          • CUDA Tools & Ecosystem (http://tegradeveloper.nvidia.com/cuda-tools-ecosystem)
          • CUDA Downloads (http://tegradeveloper.nvidia.com/cuda-downloads)
          • NVIDIA Forums (http://forums.nvidia.com/index.php?showforum=62)
          • GPGPU (http://gpgpu.org)
          • CUDA By Example, by Jason Sanders & Edward Kandrot (http://tegradeveloper.nvidia.com/content/cuda-example-introduction-general-purpose-gpu-programming-0)
          • GPU Computing Gems Emerald Edition, Editor in Chief: Prof Hwu Wen-Mei (http://www.amazon.com/GPU-Computing-Gems-Emerald-Applications/dp/0123849888/)
  17. CUDA LIBRARIES
          • Visit this site: http://developer.nvidia.com/cuda-tools-ecosystem#Libraries
          • Thrust, CUFFT, CUBLAS, CUSP, NPP, OpenCV, GPU AI-Tree Search, GPU AI-Path Finding
          • A lot of the libraries are hosted on Google Code. Many more gems in there too!
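To give a flavour of what a library such as Thrust offers, the vector addition from the earlier slide can be written without an explicit kernel or manual memory copies. This is a minimal sketch and not code from the talk; the function name `thrust_vec_add` is hypothetical.

```cuda
#include <cassert>
#include <vector>
#include <thrust/copy.h>
#include <thrust/device_vector.h>
#include <thrust/functional.h>
#include <thrust/transform.h>

// Hypothetical wrapper: element-wise a + b computed on the GPU via Thrust.
std::vector<float> thrust_vec_add(const std::vector<float>& a,
                                  const std::vector<float>& b)
{
    assert(a.size() == b.size());

    // Constructing a device_vector from host iterators copies the data to the GPU.
    thrust::device_vector<float> d_a(a.begin(), a.end());
    thrust::device_vector<float> d_b(b.begin(), b.end());
    thrust::device_vector<float> d_c(a.size());

    // Apply plus<float> to each pair of elements on the device.
    thrust::transform(d_a.begin(), d_a.end(), d_b.begin(), d_c.begin(),
                      thrust::plus<float>());

    // Copy the result back into host memory.
    std::vector<float> c(a.size());
    thrust::copy(d_c.begin(), d_c.end(), c.begin());
    return c;
}
```

Thrust picks the launch configuration and frees device memory when the `device_vector` objects go out of scope, which is why the block and grid arithmetic disappears from the caller's code.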
  18. THANK YOU @RaymondTayBL
