Representation Learning for
NLP: Deep Dive
Anuj Gupta, Satyam Saxena
• Duration : 6 hrs
• Level : Intermediate to Advanced
• Objective: For each topic, we will dig into the concepts and math
to build a theoretical understanding, followed by code (Jupyter
notebooks) to cover the implementation details. Short illustrative
code sketches follow each module in the outline below.
Outline/Time Map - 4 Modules
Module 1 (30 mins)
• Introduction to Text Representation (5 mins)
• Old ways of representing text (20 mins)
• Bag-Of-Words
• TF–IDF
• Co-occurrence matrix + SVD
• Pros and Cons
• Introduction to Embedding spaces (5 mins)
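A minimal sketch of the Module 1 representations, using scikit-learn and NumPy (our library choices for illustration; the workshop notebooks may differ):

import numpy as np
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

docs = ["the cat sat on the mat", "the dog sat on the log"]

# Bag-of-Words: raw term counts per document.
X_bow = CountVectorizer().fit_transform(docs)    # (n_docs, vocab_size)

# TF-IDF: term counts reweighted by inverse document frequency.
X_tfidf = TfidfVectorizer().fit_transform(docs)

# Co-occurrence matrix + SVD: count word pairs within a context window,
# then factorize to obtain dense word vectors.
vocab = sorted({w for d in docs for w in d.split()})
idx = {w: i for i, w in enumerate(vocab)}
C = np.zeros((len(vocab), len(vocab)))
window = 2
for d in docs:
    toks = d.split()
    for i, w in enumerate(toks):
        lo, hi = max(0, i - window), min(len(toks), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                C[idx[w], idx[toks[j]]] += 1
U, S, _ = np.linalg.svd(C)
word_vecs = U[:, :2] * S[:2]    # 2-d dense vector per word
print(dict(zip(vocab, word_vecs.round(2))))

The sparse BoW/TF-IDF rows grow with the vocabulary, while the SVD step already hints at the dense, low-dimensional embeddings of Module 2.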
Module 2 (160 mins)
• Word-Vectors
• Introduction + Bigram model (25 mins)
• CBOW model (25 mins)
• Skip-gram model (25 mins)
[Efficient estimation of word representations in vector space.
Mikolov et al., ICLR Workshop 2013]
• Speed-Up (20 mins)
• Negative Sampling
• Hierarchical Softmax
[Distributed representations of words and phrases and their compositionality.
Mikolov et al., NIPS 2013]
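A minimal NumPy sketch of skip-gram with negative sampling, as a companion to the material above. The toy corpus, dimensions, and uniform negative sampling are simplifying assumptions (the paper samples negatives from a smoothed unigram distribution):

import numpy as np

rng = np.random.default_rng(0)
corpus = "the quick brown fox jumps over the lazy dog".split()
vocab = sorted(set(corpus))
V, D, window, k, lr = len(vocab), 16, 2, 3, 0.05
idx = {w: i for i, w in enumerate(vocab)}

W_in = rng.normal(scale=0.1, size=(V, D))    # center-word vectors
W_out = rng.normal(scale=0.1, size=(V, D))   # context-word vectors
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

for epoch in range(200):
    for i, center in enumerate(corpus):
        c = idx[center]
        for j in range(max(0, i - window), min(len(corpus), i + window + 1)):
            if j == i:
                continue
            o = idx[corpus[j]]
            negs = rng.integers(0, V, size=k)   # uniform sampling, simplified
            # Positive pair: push sigmoid(u_o . v_c) toward 1.
            g = sigmoid(W_out[o] @ W_in[c]) - 1.0
            grad_c = g * W_out[o]
            W_out[o] -= lr * g * W_in[c]
            # Negative pairs: push sigmoid(u_n . v_c) toward 0.
            for n in negs:
                gn = sigmoid(W_out[n] @ W_in[c])
                grad_c += gn * W_out[n]
                W_out[n] -= lr * gn * W_in[c]
            W_in[c] -= lr * grad_c

print(W_in[idx["fox"]][:4])    # first 4 dims of a learned vector

Negative sampling replaces the full-vocabulary softmax with k binary classifications per pair, which is the source of the speed-up covered above.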
• Word-Vectors (contd)
• GloVe model (30 mins)
[GloVe: Global Vectors for Word Representation. Pennington et al.,
EMNLP 2014]
• t-SNE (15 mins)
[Visualizing Data using t-SNE. van der Maaten & Hinton, JMLR 2008;
How to Use t-SNE Effectively. Wattenberg et al., Distill 2016]
• Pros and Cons of using pre-trained word vectors (5 mins)
• Q & A (20 mins)
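A small sketch tying the GloVe and t-SNE material together: load pretrained GloVe vectors (here via gensim's downloader, one possible setup that fetches the data over the network) and project a few words to 2-D with scikit-learn's t-SNE:

import numpy as np
from sklearn.manifold import TSNE
import gensim.downloader

glove = gensim.downloader.load("glove-wiki-gigaword-50")
words = ["king", "queen", "man", "woman",
         "paris", "france", "london", "england"]
X = np.stack([glove[w] for w in words])

# Perplexity must be smaller than the number of samples;
# with only 8 points we keep it small.
tsne = TSNE(n_components=2, perplexity=3, random_state=0)
Y = tsne.fit_transform(X)
for w, (x, y) in zip(words, Y):
    print(f"{w:8s} {x:7.2f} {y:7.2f}")

As the Distill article stresses, t-SNE output depends heavily on perplexity and random seed, so treat such plots as qualitative checks rather than evidence.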
Module 3 (70 mins)
• Sentence2vec/Paragraph2vec/Doc2Vec
• Introduction (5 mins)
• PV-DM and PV-DBOW models (35 mins)
[Distributed representations of sentences and documents. Le & Mikolov,
ICML 2014]
• Skip-Thoughts model (20 mins)
[Skip-Thought Vectors. Kiros et al., arXiv preprint 2015]
• Pros and Cons (10 mins)
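A minimal gensim sketch contrasting PV-DM and PV-DBOW; the two-document corpus and the hyperparameters are toy assumptions:

from gensim.models.doc2vec import Doc2Vec, TaggedDocument

docs = [
    TaggedDocument(words="the cat sat on the mat".split(), tags=[0]),
    TaggedDocument(words="the dog sat on the log".split(), tags=[1]),
]

# dm=1 -> PV-DM: context words plus a paragraph vector predict the target word.
# dm=0 -> PV-DBOW: the paragraph vector alone predicts words from the document.
pv_dm = Doc2Vec(docs, dm=1, vector_size=32, min_count=1, epochs=50)
pv_dbow = Doc2Vec(docs, dm=0, vector_size=32, min_count=1, epochs=50)

print(pv_dm.dv[0][:4])                                      # trained doc vector
print(pv_dbow.infer_vector("a cat on a mat".split())[:4])   # unseen text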
Module 4 (70 mins)
• Char2Vec
• Introduction (5 mins)
• Introduction to RNNs, LSTMs (20 mins)
• One-hot encoding (30 mins)
[The Unreasonable Effectiveness of Recurrent Neural Networks. Andrej Karpathy, blog post, 2015]
• Character Embeddings (20 mins)
[Character-Aware Neural Language Models. Kim et al., AAAI 2016]
• Pros and Cons (5 mins)
• Q & A (10 mins)
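A compact PyTorch sketch in the spirit of Karpathy's char-rnn, combining the Module 4 pieces: one-hot character inputs, an LSTM, and next-character prediction. The architecture sizes and the training loop are illustrative assumptions:

import torch
import torch.nn as nn
import torch.nn.functional as F

text = "hello world, hello nlp"
chars = sorted(set(text))
stoi = {c: i for i, c in enumerate(chars)}
V = len(chars)

ids = torch.tensor([stoi[c] for c in text])
x = F.one_hot(ids[:-1], num_classes=V).float().unsqueeze(0)  # (1, T-1, V)
y = ids[1:].unsqueeze(0)                                     # next characters

class CharLSTM(nn.Module):
    def __init__(self, vocab, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(vocab, hidden, batch_first=True)
        self.head = nn.Linear(hidden, vocab)

    def forward(self, x):
        out, _ = self.lstm(x)      # (1, T-1, hidden)
        return self.head(out)      # (1, T-1, vocab) logits

model = CharLSTM(V)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for step in range(200):
    logits = model(x)
    loss = F.cross_entropy(logits.view(-1, V), y.view(-1))
    opt.zero_grad(); loss.backward(); opt.step()
print(loss.item())

Replacing the one-hot input with a learned nn.Embedding layer (or the convolution-over-characters encoder of Kim et al.) gives the character embeddings discussed above.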
