This document discusses Maximum A Posteriori (MAP) estimation for machine learning and data mining. It begins by introducing the Bayesian rule and defining the MAP as the value of Θ that maximizes the posterior p(Θ|X). It then shows how to develop the MAP solution by taking the logarithm of the posterior and finding the value of Θ that maximizes it. The MAP allows prior beliefs about parameter values to be incorporated into the estimation. An example application to binary classification with a Bernoulli model is provided. It derives the maximum likelihood solution and then extends it to the MAP by specifying a Beta prior distribution over the parameter.