1) The document introduces ELMo (Embeddings from Language Models), a new type of deep contextualized word representation that models both the complex characteristics of word use (e.g., syntax and semantics) and how these uses vary across linguistic contexts (i.e., polysemy).
2) ELMo representations are learned as functions of the internal states of a deep bidirectional language model (biLM), pre-trained with an unsupervised language-modeling objective on a large text corpus.
3) Unlike previous approaches, ELMo representations are deep in that they are a function of all the internal layers of the biLM rather than just the top layer, as sketched in the code below.
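Point 3 corresponds to the paper's task-specific linear combination of biLM layers, ELMo_k^task = γ^task Σ_j s_j^task h_{k,j}^LM, where the scalar weights s are softmax-normalized and learned jointly with the downstream model. Below is a minimal PyTorch sketch of that mixing step; the class name `ScalarMix`, the tensor shapes, and the two-layer setup in the usage lines are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn

class ScalarMix(nn.Module):
    """Collapse the L+1 biLM layer states for each token into one
    task-specific vector: ELMo_k = gamma * sum_j softmax(s)_j * h_{k,j}."""

    def __init__(self, num_layers: int):
        super().__init__()
        # s: per-layer scalars, softmax-normalized in forward();
        # learned jointly with the downstream task model.
        self.scalar_weights = nn.Parameter(torch.zeros(num_layers))
        # gamma: scales the whole mixed vector for the end task.
        self.gamma = nn.Parameter(torch.ones(1))

    def forward(self, layer_states: list[torch.Tensor]) -> torch.Tensor:
        # layer_states: one (batch, seq_len, dim) tensor per biLM layer,
        # including the context-insensitive token embedding layer.
        weights = torch.softmax(self.scalar_weights, dim=0)
        stacked = torch.stack(layer_states, dim=0)   # (L+1, batch, seq, dim)
        mixed = (weights.view(-1, 1, 1, 1) * stacked).sum(dim=0)
        return self.gamma * mixed

# Hypothetical usage: a 2-layer biLM plus the token layer gives 3 states.
states = [torch.randn(8, 20, 1024) for _ in range(3)]
elmo_vectors = ScalarMix(num_layers=3)(states)       # (8, 20, 1024)
```

This mirrors the paper's setup in which the biLM is frozen and only s and γ are trained with the end task, letting each task decide how heavily to weight lower layers (which the paper finds capture more syntactic information) versus higher layers (more context-dependent semantics).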