This document summarizes work on optimizing pedestrian detection with histograms of oriented gradients (HOG) on the NVIDIA Tegra X1 mobile GPU. The optimizations included increasing instruction-level parallelism, using approximations such as lower-precision arithmetic, and specializing parts of the algorithm. Together they produced an overall 1.87x speedup over the original implementation, reaching 214 frames per second on the Tegra X1.
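To put the two reported figures in context, the short sketch below derives the frame rate and per-frame time they imply for the original implementation. It assumes that the 1.87x speedup and the 214 frames per second were measured on the same workload and device, which the summary suggests but does not state explicitly. The function names are illustrative and do not come from the original work.

```python
# Back-of-the-envelope check of the reported numbers.
# Assumption: the 1.87x speedup and the 214 fps figure refer to the same
# workload, so baseline_fps = optimized_fps / speedup.

def baseline_fps(optimized_fps: float, speedup: float) -> float:
    """Frame rate of the original implementation implied by the speedup."""
    return optimized_fps / speedup


def frame_time_ms(fps: float) -> float:
    """Average time spent on one frame, in milliseconds."""
    return 1000.0 / fps


OPTIMIZED_FPS = 214.0  # reported in the summary
SPEEDUP = 1.87         # reported in the summary

implied_baseline = baseline_fps(OPTIMIZED_FPS, SPEEDUP)

print(f"implied baseline:     {implied_baseline:.1f} fps")
print(f"optimized frame time: {frame_time_ms(OPTIMIZED_FPS):.2f} ms")
print(f"baseline frame time:  {frame_time_ms(implied_baseline):.2f} ms")
```

Under that assumption, the original implementation would have run at roughly 114 frames per second, so the optimizations cut the per-frame time from about 8.7 ms to about 4.7 ms.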