This document gives an overview of a parallel n-body code, comparing serial, OpenMP, and CUDA implementations. Key insights include the computational challenges of the n-body problem, with emphasis on making the force calculation more efficient and on memory usage on the GPU. Experimental results show that the CUDA implementation significantly reduces cost compared with both the serial and OpenMP versions.
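
To make the computational challenge concrete, the following is a minimal serial sketch of a direct-sum (all-pairs) force calculation, the O(n²) kernel that n-body codes typically spend most of their time in. It is an illustration only, not the implementation described in the document: the function name `accelerations`, the unit gravitational constant `G`, and the softening length `eps` are all assumptions introduced here. The loop visits each pair once and applies Newton's third law, which is one common way to halve the work in a serial code; whether the document's code uses the same optimization is not stated.

```python
import math


def accelerations(pos, mass, G=1.0, eps=1e-3):
    """Direct-sum gravitational accelerations in 3D (hypothetical helper).

    pos  : list of (x, y, z) positions
    mass : list of masses, same length as pos
    G    : gravitational constant (assumed 1.0 for illustration)
    eps  : softening length (assumed) that avoids division by zero
    """
    n = len(pos)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        # Start at i + 1 so every unordered pair is evaluated exactly once.
        for j in range(i + 1, n):
            d = [pos[j][k] - pos[i][k] for k in range(3)]
            r2 = d[0] * d[0] + d[1] * d[1] + d[2] * d[2] + eps * eps
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            for k in range(3):
                # Equal and opposite forces: body i is pulled toward j,
                # body j is pulled toward i.
                acc[i][k] += G * mass[j] * inv_r3 * d[k]
                acc[j][k] -= G * mass[i] * inv_r3 * d[k]
    return acc
```

The pair-symmetric update makes two bodies write to shared accumulators, so a parallel version would need either atomic updates or a restructured loop in which each thread computes all interactions for its own body. Which of these trade-offs the OpenMP and CUDA versions make is left to the full document.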