Word Embeddings and RNN Techniques
Contents
• Word Embeddings
• Applications of Sequence Modelling
• RNN
• GRU
• LSTM
• Attention Mechanism
Word Embeddings
• Continuous Bag of Words (CBOW): predict the center word from the (bag of) context words
• Skip-gram: predict the context ("outside") words, position-independent, given the center word
• GloVe: co-occurrence statistics collected in a word co-occurrence matrix X
A minimal training sketch follows this list.
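A minimal skip-gram training sketch in Python, assuming the gensim 4.x Word2Vec API; the two-sentence corpus is a toy stand-in invented for illustration:

    from gensim.models import Word2Vec

    # toy corpus: each sentence is a list of tokens
    sentences = [
        ["jane", "walked", "into", "the", "room"],
        ["john", "walked", "in", "too"],
    ]

    # sg=1 selects skip-gram; sg=0 would select CBOW instead
    model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1)

    vec = model.wv["jane"]                        # 50-dimensional embedding for "jane"
    print(model.wv.most_similar("jane", topn=2))

GloVe itself is trained differently: word vectors are fit to the log-counts of the co-occurrence matrix X rather than learned by sliding-window prediction.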
Applications of Sequence Modelling
• Text Classification
• Language Modeling
• Speech Recognition
• Caption Generation
• Machine Translation
• Document Summarization
• Music Generation
• Sentiment Classification
• Handwriting Recognition
RNN
• Sequence modelling, i.e. the network exhibits dynamic temporal behaviour over a time sequence.
• A recurrent neural network (RNN) is a class of artificial neural network in which connections between units form a directed graph along a sequence. A minimal forward-pass sketch follows.
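A minimal vanilla-RNN forward pass in numpy; the dimensions, random weights, and toy input are illustrative assumptions, not from the slides:

    import numpy as np

    rng = np.random.default_rng(0)
    d_in, d_h, T = 4, 3, 5                           # input size, hidden size, sequence length

    W_xh = rng.normal(scale=0.1, size=(d_h, d_in))   # input-to-hidden weights
    W_hh = rng.normal(scale=0.1, size=(d_h, d_h))    # hidden-to-hidden (recurrent) weights
    b_h = np.zeros(d_h)

    xs = rng.normal(size=(T, d_in))                  # a toy input sequence
    h = np.zeros(d_h)                                # initial hidden state

    for x_t in xs:
        # the same parameters W_xh, W_hh, b_h are shared across every time step
        h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)

    print(h)                                         # final hidden state summarising the sequence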
Why RNN?
• Traditional methods condition the model on a fixed window of n words; an RNN can, in principle, condition on the entire history.
• Handles variable-width inputs.
• Can be run in both directions (bidirectional RNNs).
• Parameters are shared across time steps.
Different Types of RNNs
• General model [figure-only slide]
• Machine translation model [figure-only slide]
Drawbacks of RNN
• The vanishing/exploding gradient problem (a small numerical sketch follows this list).
• Example: "Jane walked into the room. John walked in too. It was late in the day. Jane said hi to ____" (filling the blank requires a word many steps back).
• Plain RNNs are therefore poor at capturing long-term dependencies.
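A small numpy sketch of why gradients vanish: backpropagating through time multiplies the gradient by the recurrent Jacobian at every step, so a recurrent matrix of small norm shrinks the gradient geometrically. The matrix, its scale, and the step counts are illustrative assumptions:

    import numpy as np

    rng = np.random.default_rng(0)
    d_h, T = 3, 50

    W_hh = 0.5 * rng.normal(size=(d_h, d_h))   # recurrent weights with smallish norm
    grad = np.ones(d_h)                        # gradient arriving at the final step

    for t in range(1, T + 1):
        # ignoring the tanh derivative (<= 1, which only shrinks things further),
        # each step back in time multiplies the gradient by W_hh transposed
        grad = W_hh.T @ grad
        if t in (1, 10, 50):
            print(f"gradient norm after {t} steps back: {np.linalg.norm(grad):.2e}")

With weights of large norm the same loop blows up instead, which is the exploding-gradient side of the problem.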
Gated Recurrent Units (GRU)
• Update gate
• Reset gate
• New memory content
• Final memory at time step t combines the new content with the previous time step (see the equations below).
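One standard GRU formulation matching these bullets, in LaTeX notation:

    z_t = \sigma(W^{(z)} x_t + U^{(z)} h_{t-1})             % update gate
    r_t = \sigma(W^{(r)} x_t + U^{(r)} h_{t-1})             % reset gate
    \tilde{h}_t = \tanh(W x_t + r_t \odot U h_{t-1})        % new memory content
    h_t = z_t \odot h_{t-1} + (1 - z_t) \odot \tilde{h}_t   % final memory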
Long Short-Term Memory (LSTM)
• Input gate (how much the current cell matters)
• Forget gate (0 means forget the past)
• Output gate (how much of the cell is exposed)
• New memory cell
• Final memory cell
• Final hidden state (see the equations below)
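The standard LSTM equations these bullets describe, in LaTeX notation:

    i_t = \sigma(W^{(i)} x_t + U^{(i)} h_{t-1})          % input gate
    f_t = \sigma(W^{(f)} x_t + U^{(f)} h_{t-1})          % forget gate
    o_t = \sigma(W^{(o)} x_t + U^{(o)} h_{t-1})          % output gate
    \tilde{c}_t = \tanh(W^{(c)} x_t + U^{(c)} h_{t-1})   % new memory cell
    c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t      % final memory cell
    h_t = o_t \odot \tanh(c_t)                           % final hidden state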
Attention mechanism
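In, e.g., machine translation, the decoder scores every encoder hidden state at each step, normalises the scores with a softmax, and uses the weighted sum as a context vector; a standard sketch in LaTeX notation:

    e_{t,i} = \mathrm{score}(s_t, h_i)                    % alignment of decoder state s_t with encoder state h_i
    \alpha_{t,i} = \exp(e_{t,i}) / \sum_j \exp(e_{t,j})   % attention weights (softmax)
    c_t = \sum_i \alpha_{t,i} h_i                         % context vector for step t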
