The document discusses attention models for sequence-to-sequence learning. It introduces attention mechanisms that let a model focus on the most relevant parts of the input sequence when generating each token of the output sequence. Examples cover attention models for neural machine translation and image caption generation, including how attention weights are computed and how attention maps are visualized.
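
As a rough illustration of the attention-weight computation mentioned above, the sketch below uses simple dot-product scoring between a decoder state and the encoder states (the original NMT attention of Bahdanau et al. used an additive MLP score instead; dot-product is chosen here only for brevity). All names (`attention`, `encoder_states`, `decoder_state`) are illustrative, not from the source.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention(decoder_state, encoder_states):
    """Dot-product attention: score each encoder state against the
    current decoder state, normalize the scores into weights, and
    return the weighted sum (context vector) plus the weights."""
    scores = encoder_states @ decoder_state   # shape (T,), one score per source position
    weights = softmax(scores)                 # attention weights, non-negative, sum to 1
    context = weights @ encoder_states        # shape (d,), context vector for this step
    return context, weights

# Toy example: 5 source positions, hidden size 8 (hypothetical sizes).
rng = np.random.default_rng(0)
encoder_states = rng.normal(size=(5, 8))
decoder_state = rng.normal(size=(8,))
context, weights = attention(decoder_state, encoder_states)
print(weights)  # stacking these weights across output steps gives the
                # attention maps that the document visualizes
```

Collecting the weight vector at each decoding step into a matrix (output positions by input positions) yields the kind of attention map the document shows for translation and captioning examples.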