The document discusses clipped action policy gradient (CAPG), a variance-reduction technique for policy gradient methods with bounded action spaces. Such methods typically use policies with unbounded support (e.g., a Gaussian) even when the action space is bounded, relying on the environment to clip out-of-range actions; the standard estimator ignores this clipping and scores the pre-clipped action, creating a mismatch. CAPG instead exploits the clipping: since every action beyond a bound produces the same clipped action, the density at a bound can be replaced by the corresponding tail probability, which reduces the variance of the policy gradient estimate without introducing bias.
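A minimal sketch of the idea, assuming a one-dimensional Gaussian policy and SciPy; the function name `capg_log_prob` and the example parameters are illustrative, not from the original document. Interior actions keep the ordinary log-density, while actions at a bound use the log of the tail mass that clips onto that bound:

```python
import numpy as np
from scipy.stats import norm

def capg_log_prob(action, mean, std, low, high):
    """Log-probability of an action under a Gaussian policy whose
    samples are clipped to [low, high] by the environment.

    Actions at the bounds absorb the entire tail mass (CDF below the
    lower bound, survival function above the upper bound), so the
    score-function estimator built on this log-prob has lower variance
    than one using the unclipped Gaussian density, while staying unbiased.
    """
    if action <= low:
        # All probability mass below `low` maps to the lower bound.
        return norm.logcdf(low, loc=mean, scale=std)
    elif action >= high:
        # All probability mass above `high` maps to the upper bound.
        return norm.logsf(high, loc=mean, scale=std)
    else:
        # Interior actions are scored with the usual Gaussian density.
        return norm.logpdf(action, loc=mean, scale=std)

# Hypothetical policy N(0.8, 0.5) acting on the box [-1, 1]:
lp_interior = capg_log_prob(0.5, 0.8, 0.5, -1.0, 1.0)   # ordinary log-density
lp_boundary = capg_log_prob(1.0, 0.8, 0.5, -1.0, 1.0)   # log tail mass at +1
```

In practice the gradient of this log-probability (times the return or advantage) replaces the gradient of the unclipped log-density in the policy gradient estimator.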