This document discusses model-free continuous control in reinforcement learning. It introduces policy gradient and value gradient methods for learning continuous control policies with neural networks. Policy gradient methods adjust the policy parameters directly by ascending an estimate of the gradient of expected return, typically computed from sampled trajectories. Value gradient methods instead update the policy by following the gradient of a learned state-action value function with respect to the action, backpropagating that gradient into the policy parameters. Actor-critic methods combine the two: a learned value function acts as a critic that reduces the variance of the policy gradient estimate. Minimal sketches of each update are given below.
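As a concrete illustration, here is a minimal sketch of a policy gradient (REINFORCE-style) update for a continuous-action Gaussian policy, assuming PyTorch. The names (GaussianPolicy, policy_gradient_step) are illustrative, not taken from the source or any particular library.

```python
import torch
import torch.nn as nn

class GaussianPolicy(nn.Module):
    """Maps observations to a diagonal Gaussian over continuous actions."""
    def __init__(self, obs_dim, act_dim):
        super().__init__()
        self.mean_net = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.Tanh(), nn.Linear(64, act_dim)
        )
        self.log_std = nn.Parameter(torch.zeros(act_dim))

    def dist(self, obs):
        return torch.distributions.Normal(self.mean_net(obs), self.log_std.exp())

def policy_gradient_step(policy, optimizer, obs, actions, returns):
    """One REINFORCE update: ascend an estimate of E[log pi(a|s) * return]."""
    log_probs = policy.dist(obs).log_prob(actions).sum(dim=-1)
    loss = -(log_probs * returns).mean()  # negated because optimizers minimize
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# usage with dummy batch data
policy = GaussianPolicy(obs_dim=3, act_dim=1)
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
obs = torch.randn(32, 3)
actions = policy.dist(obs).sample()
returns = torch.randn(32)  # stand-in for discounted returns
policy_gradient_step(policy, opt, obs, actions, returns)
```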
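A value gradient update, in the spirit of deterministic policy gradient methods such as DDPG, can be sketched as follows. The actor is improved by following the chain rule through a learned Q-function (dQ/da times da/dtheta); the names Actor, Critic, and value_gradient_step are again illustrative assumptions.

```python
import torch
import torch.nn as nn

class Actor(nn.Module):
    """Deterministic policy mu(s) -> a, with actions bounded by tanh."""
    def __init__(self, obs_dim, act_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, act_dim), nn.Tanh()
        )
    def forward(self, obs):
        return self.net(obs)

class Critic(nn.Module):
    """Learned state-action value function Q(s, a)."""
    def __init__(self, obs_dim, act_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, 64), nn.ReLU(), nn.Linear(64, 1)
        )
    def forward(self, obs, act):
        return self.net(torch.cat([obs, act], dim=-1))

def value_gradient_step(actor, critic, actor_opt, obs):
    """Ascend Q(s, mu(s)) in the actor's parameters; only the actor's
    optimizer steps, so the critic stays fixed during this update."""
    loss = -critic(obs, actor(obs)).mean()
    actor_opt.zero_grad()
    loss.backward()
    actor_opt.step()

actor, critic = Actor(3, 1), Critic(3, 1)
actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-3)
value_gradient_step(actor, critic, actor_opt, torch.randn(32, 3))
```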
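Finally, a sketch of the actor-critic combination, reusing GaussianPolicy from the first sketch: a learned state-value baseline V(s) is subtracted from the return, which lowers the variance of the policy gradient estimate without biasing it. The helper names here are hypothetical.

```python
import torch
import torch.nn as nn

def actor_critic_step(policy, value_fn, policy_opt, value_opt,
                      obs, actions, returns):
    """One actor-critic update: the critic V(s) serves as a baseline."""
    values = value_fn(obs).squeeze(-1)
    advantages = (returns - values).detach()  # stop gradient into the actor loss

    log_probs = policy.dist(obs).log_prob(actions).sum(dim=-1)
    policy_loss = -(log_probs * advantages).mean()  # lower-variance policy gradient
    policy_opt.zero_grad()
    policy_loss.backward()
    policy_opt.step()

    value_loss = (returns - values).pow(2).mean()  # regress V toward observed returns
    value_opt.zero_grad()
    value_loss.backward()
    value_opt.step()

# usage: value_fn is a small state-value network (illustrative)
policy = GaussianPolicy(obs_dim=3, act_dim=1)  # from the first sketch
value_fn = nn.Sequential(nn.Linear(3, 64), nn.Tanh(), nn.Linear(64, 1))
policy_opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
value_opt = torch.optim.Adam(value_fn.parameters(), lr=1e-3)
obs = torch.randn(32, 3)
actions = policy.dist(obs).sample()
returns = torch.randn(32)  # stand-in for discounted returns
actor_critic_step(policy, value_fn, policy_opt, value_opt, obs, actions, returns)
```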