Smart Reply - Word-level Sequence to Sequence
Problem formulation
• Topical Chat dataset
This file contains information about the following fields (a loading sketch follows these bullets):
• conversation_id
• Message
• Sentiment
• To implement a basic character-level recurrent sequence-to-sequence model.
• We apply it to generate short English sentences for smart reply, character by character.
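A minimal sketch of loading the dataset with pandas; the file name topical_chat.csv is an assumption, and the column names are taken from the fields listed above.

```python
import pandas as pd

# Assumed file name; the fields listed above (conversation_id, Message,
# Sentiment) are the columns described in the project.
df = pd.read_csv("topical_chat.csv")

print(df.shape)
print(df.columns.tolist())
print(df.head())  # the "Head of the dataset" step
```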
Summary of the algorithm
• We start with input sequences from one domain and corresponding target sequences from another domain.
• An encoder LSTM turns input sequences into 2 state vectors (we keep the last LSTM state and discard the outputs).
• A decoder LSTM is trained to turn the target sequences into the same
sequence but offset by one time step in the future, a training process
called "teacher forcing" in this context. It uses as initial state the state
vectors from the encoder. Effectively, the decoder learns to
generate targets[t+1...] given targets[...t], conditioned on the input
sequence.
• In inference mode, when we want to decode unknown input sequences, we (see the sketch after this list):
  - Encode the input sequence into state vectors.
  - Start with a target sequence of size 1 (just the start-of-sequence character).
  - Feed the state vectors and the 1-char target sequence to the decoder to produce predictions for the next character.
  - Sample the next character using these predictions (we simply use argmax).
  - Append the sampled character to the target sequence.
  - Repeat until we generate the end-of-sequence character or we hit the character limit.
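A minimal sketch of this greedy decoding loop, assuming that the inference-time encoder_model and decoder_model (built in step 8 below), the BOS/EOS markers, and the token/index dictionaries from the data-preparation step already exist; all names are illustrative.

```python
import numpy as np

def decode_sequence(input_seq, encoder_model, decoder_model,
                    target_token_index, reverse_target_token_index,
                    num_decoder_tokens, max_decoder_seq_length):
    # Encode the input sequence into the encoder's final state vectors.
    states_value = encoder_model.predict(input_seq, verbose=0)

    # Start with a length-1 target sequence containing only the BOS token.
    target_seq = np.zeros((1, 1, num_decoder_tokens))
    target_seq[0, 0, target_token_index["BOS"]] = 1.0

    decoded = []
    while True:
        # Predict the next token from the current target token and states.
        output_tokens, h, c = decoder_model.predict(
            [target_seq] + states_value, verbose=0)

        # Greedy sampling: take the argmax of the output distribution.
        sampled_index = int(np.argmax(output_tokens[0, -1, :]))
        sampled_token = reverse_target_token_index[sampled_index]

        # Stop on EOS or when the length limit is reached.
        if sampled_token == "EOS" or len(decoded) >= max_decoder_seq_length:
            break
        decoded.append(sampled_token)

        # Feed the sampled token back in (as one-hot) and carry the states forward.
        target_seq = np.zeros((1, 1, num_decoder_tokens))
        target_seq[0, 0, sampled_index] = 1.0
        states_value = [h, c]

    return " ".join(decoded)
```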
Head of the dataset
1. Setup
2. Configuration
3. Load the Dataset
4. Cleaning the data
5. Prepare the data
• Number of iterations = 186952
• Time taken to run the loop = 00:38<00:00 (tqdm elapsed<remaining)
• Progress is tracked using the tqdm library (see the sketch below).
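A minimal sketch of a tqdm-tracked cleaning loop, assuming a pandas DataFrame df with a Message column; the exact cleaning rules used in the project may differ.

```python
import re
from tqdm import tqdm

def clean_text(text):
    # Lowercase and keep only letters, digits and spaces.
    text = str(text).lower()
    return re.sub(r"[^a-z0-9\s]", "", text).strip()

cleaned = []
# tqdm wraps the iterable and prints progress such as "186952/186952 [00:38<00:00]".
for message in tqdm(df["Message"]):
    cleaned.append(clean_text(message))

df["Message"] = cleaned
```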
1. Vectorize the data.
2. Limit the length of the texts and append them to a variable for vectorization.
3. Add BOS and EOS markers to the decoder input.
4. Collect the unique words.
• Create a word-index dictionary, with the word as the key and its index as the value.
• Store the object in a 'handle' variable for both the input sequence and the target sequence.
• Initialize the encoder/decoder arrays: both the input and the output are three-dimensional tensors, one per sentence, with every word one-hot encoded (a vectorization sketch follows this list).
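A minimal sketch of this vectorization step, assuming paired lists input_texts (a message) and target_texts (its reply) extracted from the cleaned data; variable names and the pickle file name are illustrative.

```python
import pickle
import numpy as np

# Wrap each target sentence with BOS/EOS markers for the decoder.
target_texts = ["BOS " + t + " EOS" for t in target_texts]

# Collect the unique words and build word -> index dictionaries.
input_words = sorted({w for t in input_texts for w in t.split()})
target_words = sorted({w for t in target_texts for w in t.split()})
input_token_index = {w: i for i, w in enumerate(input_words)}
target_token_index = {w: i for i, w in enumerate(target_words)}

# Persist the dictionary via a file handle so inference can reuse it.
with open("input_token_index.pickle", "wb") as handle:
    pickle.dump(input_token_index, handle)

max_encoder_seq_length = max(len(t.split()) for t in input_texts)
max_decoder_seq_length = max(len(t.split()) for t in target_texts)

# Three-dimensional tensors: (sentence, time step, one-hot word).
encoder_input_data = np.zeros(
    (len(input_texts), max_encoder_seq_length, len(input_words)), dtype="float32")
decoder_input_data = np.zeros(
    (len(target_texts), max_decoder_seq_length, len(target_words)), dtype="float32")
decoder_target_data = np.zeros(
    (len(target_texts), max_decoder_seq_length, len(target_words)), dtype="float32")

for i, (inp, tgt) in enumerate(zip(input_texts, target_texts)):
    for t, word in enumerate(inp.split()):
        encoder_input_data[i, t, input_token_index[word]] = 1.0
    for t, word in enumerate(tgt.split()):
        decoder_input_data[i, t, target_token_index[word]] = 1.0
        if t > 0:
            # The decoder target is the same sequence shifted one step ahead.
            decoder_target_data[i, t - 1, target_token_index[word]] = 1.0
```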
6. Build the Model
7. Train the model
Training and validation loss metrics were observed during the training phase (a model-building and training sketch follows).
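A minimal sketch of the encoder-decoder model and the training call, following the Keras LSTM seq2seq example referenced below; latent_dim, batch size, and epoch count are assumptions.

```python
from tensorflow import keras
from tensorflow.keras import layers

num_encoder_tokens = len(input_token_index)   # from the vectorization step
num_decoder_tokens = len(target_token_index)
latent_dim = 256                              # assumed size of the LSTM state vectors

# Encoder: keep only the final hidden and cell states.
encoder_inputs = keras.Input(shape=(None, num_encoder_tokens))
encoder_outputs, state_h, state_c = layers.LSTM(
    latent_dim, return_state=True)(encoder_inputs)
encoder_states = [state_h, state_c]

# Decoder: return full sequences, initialised with the encoder states.
decoder_inputs = keras.Input(shape=(None, num_decoder_tokens))
decoder_lstm = layers.LSTM(latent_dim, return_sequences=True, return_state=True)
decoder_outputs, _, _ = decoder_lstm(decoder_inputs, initial_state=encoder_states)
decoder_dense = layers.Dense(num_decoder_tokens, activation="softmax")
decoder_outputs = decoder_dense(decoder_outputs)

model = keras.Model([encoder_inputs, decoder_inputs], decoder_outputs)
model.compile(optimizer="rmsprop", loss="categorical_crossentropy",
              metrics=["accuracy"])

# Teacher forcing: decoder_input_data is the target sequence one step behind
# decoder_target_data, as prepared in the vectorization step.
model.fit([encoder_input_data, decoder_input_data], decoder_target_data,
          batch_size=64, epochs=50, validation_split=0.2)
```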
8. Create Inference Model
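A minimal sketch of rebuilding separate encoder and decoder models for inference from the trained layers, again following the Keras example; these are the encoder_model and decoder_model assumed by the decoding loop shown earlier, and the layer names come from the training sketch above.

```python
from tensorflow import keras

# Encoder inference model: input sequence -> final state vectors.
encoder_model = keras.Model(encoder_inputs, encoder_states)

# Decoder inference model: previous token + states -> next-token probabilities + new states.
decoder_state_input_h = keras.Input(shape=(latent_dim,))
decoder_state_input_c = keras.Input(shape=(latent_dim,))
decoder_states_inputs = [decoder_state_input_h, decoder_state_input_c]

decoder_outputs, state_h, state_c = decoder_lstm(
    decoder_inputs, initial_state=decoder_states_inputs)
decoder_outputs = decoder_dense(decoder_outputs)

decoder_model = keras.Model(
    [decoder_inputs] + decoder_states_inputs,
    [decoder_outputs, state_h, state_c])
```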
Output
Conclusion
• The project attempts to predict the next character or word in the sequence.
• The project generates predictions by feeding in the input text.
• The model has not reached human-level performance, however.
• To attain more accurate results from this model, we can increase the size of the dataset.
References
• https://keras.io/examples/nlp/lstm_seq2seq/
• https://alvinntnu.github.io/python-notes/nlp/seq-to-seq-machine-translation.html
• https://blog.paperspace.com/implement-seq2seq-for-text-summarization-keras/
