The document addresses the challenge of handling rare and unknown words in natural language processing (NLP) systems and proposes a novel solution based on neural networks with attention mechanisms. The model uses two softmax layers to predict outputs over a predefined shortlist of the most frequent words; words outside the shortlist are represented by a special unknown-word token. The authors highlight fundamental issues with the shortlist approach and describe the architecture of the proposed neural machine translation model.
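
To make the shortlist idea concrete, the following is a minimal Python sketch of how a vocabulary restricted to the most frequent words might map every other word to one special token. It is an illustration under stated assumptions, not the authors' implementation: the token string `<unk>`, the helper names `build_shortlist` and `encode`, and the toy corpus are all hypothetical.

```python
from collections import Counter

UNK = "<unk>"  # hypothetical name for the special unknown-word token


def build_shortlist(tokens, size):
    """Map the `size` most frequent words to ids; id 0 is reserved for UNK."""
    counts = Counter(tokens)
    # Rank by descending frequency, then alphabetically so ties are deterministic.
    ranked = sorted((w for w in counts if w != UNK), key=lambda w: (-counts[w], w))
    vocab = {UNK: 0}
    for word in ranked[:size]:
        vocab[word] = len(vocab)
    return vocab


def encode(tokens, vocab):
    """Replace each token by its shortlist id, or by the UNK id if it is not listed."""
    return [vocab.get(token, vocab[UNK]) for token in tokens]


# Toy corpus (assumed for illustration only).
corpus = "the cat sat on the mat the cat slept".split()
vocab = build_shortlist(corpus, size=2)
ids = encode("the dog sat".split(), vocab)
print(vocab)
print(ids)
```

In this sketch both "dog" (never seen) and "sat" (seen, but too rare for the shortlist) receive the same id, so the distinction between them is lost once the text is encoded. That collapse is one visible consequence of relying on a fixed shortlist.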