CUDA Architecture
Prof. Shashikant V. Athawale
Assistant Professor | Computer Engineering Department | AISSMS College of Engineering,
Kennedy Road, Pune, MH, India - 411001
Contents
❖ CUDA Architecture
❖ Applications of CUDA
❖ Introduction to CUDA C: write and launch CUDA C kernels
❖ Manage GPU memory
❖ Manage communication and synchronization
❖ Parallel programming in CUDA C
Communication and Synchronization in Threads
CUDA Architecture
Applications of CUDA
CUDA C: The Basics
❖ Based on industry-standard C
❖ A handful of language extensions to allow heterogeneous
programs
❖ Straightforward APIs to manage devices, memory, etc.
❖ Terminology:
➢ Host – The CPU and its memory (host memory)
➢ Device – The GPU and its memory (device memory)
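As a rough illustration of the host/device split and the device-management API, the sketch below has the host (CPU) ask the CUDA runtime which devices (GPUs) are visible. The helper name `listDevices` is hypothetical and not from the slides. It assumes the CUDA runtime API (`cuda_runtime.h`) and compilation with `nvcc`.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper (not from the slides): runs on the host and prints
// basic facts about each device the CUDA runtime can see.
// Returns the number of devices, or -1 if the runtime reports an error.
int listDevices() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceCount failed: %s\n",
                     cudaGetErrorString(err));
        return -1;
    }
    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, i) != cudaSuccess) return -1;
        // totalGlobalMem is the size of device memory, separate from host memory.
        std::printf("Device %d: %s, %d multiprocessors, %zu bytes of device memory\n",
                    i, prop.name, prop.multiProcessorCount, prop.totalGlobalMem);
    }
    return count;
}
```

Calling `listDevices()` from `main` on a machine with a working driver should print one line per GPU. The exact text depends on the hardware.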
CUDA Kernels
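A minimal sketch of writing and launching a kernel, assuming a simple element-wise vector addition as the workload. The names `addVectors` and `launchAddVectors` are illustrative, the block size of 256 is an arbitrary choice, and `n` is assumed to be positive.

```cuda
#include <cuda_runtime.h>

// Kernel: __global__ marks a function that runs on the device and is
// launched from the host. Each thread handles one array element.
__global__ void addVectors(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n) {                                    // guard the last, partial block
        c[i] = a[i] + b[i];
    }
}

// Hypothetical host wrapper: the pointers must refer to memory the device
// can access (for example from cudaMalloc or cudaMallocManaged), and n > 0.
cudaError_t launchAddVectors(const float *d_a, const float *d_b, float *d_c, int n) {
    const int threadsPerBlock = 256;
    const int blocks = (n + threadsPerBlock - 1) / threadsPerBlock;  // round up
    addVectors<<<blocks, threadsPerBlock>>>(d_a, d_b, d_c, n);
    cudaError_t err = cudaGetLastError();  // catches launch-configuration errors
    if (err != cudaSuccess) return err;
    return cudaDeviceSynchronize();        // wait for the kernel to finish
}
```

The `<<<blocks, threadsPerBlock>>>` syntax is one of the language extensions CUDA C adds to C. Everything else in the wrapper is ordinary host code.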
GPU Memory Management
Data Transfer Direction Keywords (cudaMemcpyKind values passed to cudaMemcpy)
❖ cudaMemcpyHostToHost
❖ cudaMemcpyHostToDevice
❖ cudaMemcpyDeviceToHost
❖ cudaMemcpyDeviceToDevice
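The direction keywords above are the last argument to `cudaMemcpy`. The sketch below shows one common pattern, under assumptions: allocate device memory, copy host to device, run a kernel, copy device to host, then free. The kernel `scaleInPlace` and helper `scaleOnDevice` are hypothetical names chosen for illustration, and `n` is assumed to be positive.

```cuda
#include <cstddef>
#include <cuda_runtime.h>

// Illustrative kernel: multiplies every element by a factor.
__global__ void scaleInPlace(float *data, float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Hypothetical helper: round-trips a host array through device memory.
cudaError_t scaleOnDevice(float *h_data, float factor, int n) {
    float *d_data = nullptr;
    const size_t bytes = static_cast<size_t>(n) * sizeof(float);

    cudaError_t err = cudaMalloc(&d_data, bytes);  // allocate device memory
    if (err != cudaSuccess) return err;

    // Host -> device copy of the input.
    err = cudaMemcpy(d_data, h_data, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        const int threads = 256;
        const int blocks = (n + threads - 1) / threads;
        scaleInPlace<<<blocks, threads>>>(d_data, factor, n);
        err = cudaGetLastError();
    }
    // Device -> host copy of the result; this call waits for the kernel.
    if (err == cudaSuccess) {
        err = cudaMemcpy(h_data, d_data, bytes, cudaMemcpyDeviceToHost);
    }
    cudaFree(d_data);  // release device memory on every path
    return err;
}
```

`cudaMemcpyDeviceToDevice` would be used the same way when both pointers refer to device memory, and `cudaMemcpyHostToHost` when both refer to host memory.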
Parallel Programming in CUDA C
❖ CUDA brings data-parallel computing to the masses.
❖ CUDA is a scalable parallel programming model.
❖ The same program runs on a GPU with any number of processors
without recompiling.
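One way to see the scalability claim in code is a grid-stride loop, sketched below. This is a common CUDA idiom rather than something the slides prescribe. The kernel gives the same result for any launch configuration, so the grid size can be tuned per GPU without touching the kernel. The names `saxpy` and `runSaxpy` are illustrative.

```cuda
#include <cuda_runtime.h>

// y[i] = a * x[i] + y[i] for all i < n, correct for any grid size:
// each thread starts at its global index and strides by the total
// number of threads in the grid.
__global__ void saxpy(int n, float a, const float *x, float *y) {
    int stride = gridDim.x * blockDim.x;
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n; i += stride) {
        y[i] = a * x[i] + y[i];
    }
}

// Hypothetical wrapper: the caller picks the launch shape (both values
// must be positive), which could be derived from the device's
// multiprocessor count.
cudaError_t runSaxpy(int n, float a, const float *d_x, float *d_y,
                     int blocks, int threadsPerBlock) {
    saxpy<<<blocks, threadsPerBlock>>>(n, a, d_x, d_y);
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) return err;
    return cudaDeviceSynchronize();
}
```

Launching with 1 block or with 64 blocks produces the same output. Only the amount of work done in parallel changes.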
Architecture of Parallel CUDA Programming
CUDA Uses Extensive Multithreading
❖ CUDA threads express fine-grained data parallelism.
➢ Map threads to GPU threads.
➢ Virtualize the processors.
❖ CUDA thread blocks express coarse-grained parallelism.
➢ Blocks hold arrays of GPU threads and define shared-memory
boundaries.
➢ Allow scaling between smaller and larger GPUs.
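The block as a shared-memory and synchronization boundary can be sketched with a per-block sum, assuming a power-of-two block size. Threads in one block communicate through `__shared__` memory and meet at `__syncthreads()`. Different blocks do not share this memory. The names `blockSum`, `launchBlockSum`, and `kBlockSize` are illustrative, and `n` is assumed to be positive.

```cuda
#include <cuda_runtime.h>

constexpr int kBlockSize = 256;  // assumed power of two for the halving loop

// Each block reduces its slice of the input to one partial sum.
__global__ void blockSum(const float *in, float *blockSums, int n) {
    __shared__ float partial[kBlockSize];  // visible only within this block

    int tid = threadIdx.x;
    int i = blockIdx.x * blockDim.x + tid;
    partial[tid] = (i < n) ? in[i] : 0.0f;
    __syncthreads();  // all loads finished before anyone reads a neighbor

    // Tree reduction: halve the number of active threads each step.
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (tid < s) partial[tid] += partial[tid + s];
        __syncthreads();  // reached by every thread in the block
    }
    if (tid == 0) blockSums[blockIdx.x] = partial[0];
}

// Hypothetical wrapper: d_blockSums needs one float per launched block.
cudaError_t launchBlockSum(const float *d_in, float *d_blockSums, int n) {
    const int blocks = (n + kBlockSize - 1) / kBlockSize;
    blockSum<<<blocks, kBlockSize>>>(d_in, d_blockSums, n);
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) return err;
    return cudaDeviceSynchronize();
}
```

Because each block only writes its own partial sum, blocks can run in any order and on any number of multiprocessors. The partial sums still need a final pass on the host or in a second kernel.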
CUDA Uses Extensive Multithreading
❖ GPUs execute thousands of lightweight threads.
➢ In graphics, each thread computes one pixel.
➢ One CUDA thread computes one result (or several
results).
➢ Hardware multithreading & zero-overhead
scheduling.
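The "one thread, one pixel" idea maps onto a 2D grid of threads. The sketch below assumes an 8-bit grayscale image stored row by row and inverts it, with each thread computing exactly one result. The names `invertImage` and `launchInvertImage` and the 16x16 block shape are illustrative choices.

```cuda
#include <cuda_runtime.h>

// One thread per pixel: the thread's (x, y) position in the grid is the
// pixel coordinate it is responsible for.
__global__ void invertImage(unsigned char *pixels, int width, int height) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < width && y < height) {  // edge blocks may overhang the image
        int idx = y * width + x;    // row-major layout (assumed)
        pixels[idx] = 255 - pixels[idx];
    }
}

// Hypothetical wrapper: d_pixels must be device-accessible and hold
// width * height bytes, with width and height positive.
cudaError_t launchInvertImage(unsigned char *d_pixels, int width, int height) {
    dim3 block(16, 16);
    dim3 grid((width + block.x - 1) / block.x,
              (height + block.y - 1) / block.y);
    invertImage<<<grid, block>>>(d_pixels, width, height);
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) return err;
    return cudaDeviceSynchronize();
}
```

For a 1920x1080 image this launches about two million threads, which is the scale at which hardware multithreading and cheap scheduling matter.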
Applications
❖ Workloads that need high memory bandwidth
❖ Visual computing
❖ Workloads with high arithmetic intensity
