This document serves as an introduction to writing CUDA C programs for GPU computation using the CUDA runtime API. It discusses the differences between programming in C and CUDA, specifically focusing on kernel functions, memory management, and the process of data transfer between host (CPU) and device (GPU) memory. Additionally, the document presents step-by-step guidance for converting a vector addition example from C to CUDA, highlighting the necessary modifications to manage memory and execute kernels effectively.