This document proposes the Gumbel-Softmax distribution as a way to sample differentiably from a categorical distribution. Sampling from a categorical distribution is a non-differentiable operation, so models with categorical latent variables cannot be trained directly with backpropagation. The REINFORCE algorithm sidesteps this with a likelihood-ratio gradient estimator, but its gradient estimates suffer from high variance. Gumbel-Softmax instead replaces the categorical distribution with a continuous relaxation built on the Gumbel-Max trick: Gumbel noise is added to the log-probabilities, and the argmax is softened into a temperature-controlled softmax, allowing gradients to flow through the sampling step. It shows that this continuous relaxation behaves similarly to the discrete categorical distribution while remaining differentiable, enabling lower-variance training of models with categorical latent variables.
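As a rough sketch of the mechanism described above (the function name, NumPy usage, and example probabilities are illustrative choices, not taken from the original):

```python
import numpy as np

def sample_gumbel_softmax(logits, temperature=1.0, rng=None):
    """Draw one relaxed (differentiable) sample from a categorical
    distribution given unnormalized log-probabilities.

    Gumbel-Max trick: argmax(logits + g), with g ~ Gumbel(0, 1), is an
    exact categorical sample; replacing the argmax with a
    temperature-controlled softmax gives the Gumbel-Softmax relaxation.
    """
    rng = rng or np.random.default_rng()
    u = rng.uniform(low=1e-20, high=1.0, size=logits.shape)  # avoid log(0)
    gumbel_noise = -np.log(-np.log(u))                       # Gumbel(0, 1) samples
    z = (logits + gumbel_noise) / temperature
    z = z - z.max()                                          # numerical stability
    return np.exp(z) / np.exp(z).sum()

# Example: class probabilities [0.1, 0.6, 0.3] (hypothetical values).
logits = np.log(np.array([0.1, 0.6, 0.3]))
print(sample_gumbel_softmax(logits, temperature=0.5))
```

As the temperature approaches zero, the relaxed sample approaches a one-hot categorical draw (matching the Gumbel-Max argmax); at higher temperatures it becomes smoother and closer to uniform, which is the trade-off between bias and gradient variance that the relaxation exposes.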