"Deep contextualized word representations" introduces ELMo (Embeddings from Language Models), contextual word representations that capture both complex characteristics of word use (e.g., syntax and semantics) and how those uses vary across linguistic contexts (i.e., polysemy). Adding ELMo to existing task-specific models establishes new state-of-the-art results on six NLP tasks. The representations come from a deep bidirectional language model (biLM): for each token, the task model receives a learned, task-specific weighted combination of the biLM's internal layers, concatenated with its input representations (and, for some tasks, with its output-layer representations as well).
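The layer-combination step can be sketched as follows. ELMo collapses the biLM's L layers of hidden states into one vector per token via softmax-normalized, task-learned weights s and a scalar scale γ (ELMo_k = γ Σ_j s_j h_{k,j}). This is a minimal NumPy illustration; the function and variable names are illustrative, not from any ELMo library.

```python
import numpy as np

def elmo_combination(layer_states, s_logits, gamma):
    """Collapse biLM layer activations into one ELMo vector per token.

    layer_states: (L, T, D) hidden states from L biLM layers, T tokens, D dims
    s_logits: (L,) unnormalized task-specific layer weights
    gamma: scalar scaling the combined representation
    """
    # Softmax-normalize the layer weights (numerically stable form)
    s = np.exp(s_logits - s_logits.max())
    s = s / s.sum()
    # Weighted sum over the layer axis -> (T, D)
    combined = np.tensordot(s, layer_states, axes=1)
    return gamma * combined

# Toy example: 3 biLM layers, 4 tokens, 5-dimensional states
rng = np.random.default_rng(0)
states = rng.normal(size=(3, 4, 5))
elmo = elmo_combination(states, np.zeros(3), gamma=1.0)
print(elmo.shape)  # (4, 5); zero logits give a uniform average over layers
```

In the paper's setup these weights are trained jointly with the downstream task while the biLM itself stays frozen, so each task learns its own mix of the biLM's layers.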