This document discusses Deep Deterministic Policy Gradient (DDPG), a reinforcement learning algorithm for problems with continuous state and action spaces. DDPG uses an actor-critic method with experience replay and soft target updates to learn a policy in an off-policy manner. It demonstrates how DDPG can be used to train an agent to drive a vehicle in a simulator by designing a reward function, but notes that designing effective rewards, avoiding local optima, instability, and data requirements are challenges for DDPG.