1) The document discusses contrastive divergence (CD) learning, a method for training probabilistic models with gradient-based updates. 2) Because the exact gradient of the log-likelihood is intractable, CD approximates it with Markov chain Monte Carlo sampling: a short Markov chain is initialized at the training data and run for only a few steps toward the model distribution. 3) This introduces a potential bias relative to true maximum likelihood learning, because contrastive divergence approximately minimizes the difference between the Kullback-Leibler divergence of the data from the model and that of the samples after one step of MCMC from the model, rather than the full divergence between the data and model distributions.
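
The one-step procedure described above (CD-1) can be sketched for a small restricted Boltzmann machine in NumPy. This is a minimal illustration, not the document's own implementation: the toy data, layer sizes, learning rate, and number of epochs are all assumptions chosen for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy binary data with two correlated patterns (illustrative assumption).
data = np.array([[1, 1, 1, 0, 0, 0],
                 [1, 1, 1, 0, 0, 0],
                 [0, 0, 0, 1, 1, 1],
                 [0, 0, 0, 1, 1, 1]], dtype=float)

n_visible, n_hidden = 6, 3
W = 0.01 * rng.standard_normal((n_visible, n_hidden))
b_v = np.zeros(n_visible)   # visible biases
b_h = np.zeros(n_hidden)    # hidden biases
lr = 0.1

for epoch in range(2000):
    # Positive phase: hidden probabilities conditioned on the data.
    ph_data = sigmoid(data @ W + b_h)
    h_sample = (rng.random(ph_data.shape) < ph_data).astype(float)

    # Negative phase: one Gibbs step (CD-1). The chain starts at the
    # data rather than being run to equilibrium, which is the source
    # of the bias relative to exact maximum likelihood.
    pv_recon = sigmoid(h_sample @ W.T + b_v)
    v_sample = (rng.random(pv_recon.shape) < pv_recon).astype(float)
    ph_recon = sigmoid(v_sample @ W + b_h)

    # Approximate gradient: data statistics minus one-step statistics.
    W += lr * (data.T @ ph_data - v_sample.T @ ph_recon) / len(data)
    b_v += lr * (data - v_sample).mean(axis=0)
    b_h += lr * (ph_data - ph_recon).mean(axis=0)

# After training, reconstructions should resemble the training data.
recon = sigmoid(sigmoid(data @ W + b_h) @ W.T + b_v)
```

Running more Gibbs steps per update (CD-k with larger k) reduces the bias at extra computational cost, since the negative-phase samples move closer to the model distribution.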