This document discusses GPU programming with CUDA. It begins with an introduction to Nvidia graphics cards and the CUDA programming model. It then covers Nvidia GPU architecture, tracing the evolution of GPU generations from Tesla to Volta. The CUDA programming model is then summarized, including its use of kernels, threads, and memory access. Finally, a case study on parallelizing a dot product on a GPU is presented to demonstrate CUDA programming.