Synthetic dialogue generation with Deep Learning

Synthetic Dialogue Generation with Deep Learning
Using TensorFlow
Nasar, Syed | August 2017

About me
Machine Learning and Big Data Architect
MS Computer Science, Machine Learning, Georgia Tech
Deep Learning Nanodegree, Udacity
Artificial Intelligence Nanodegree, Udacity
Connect
• Twitter: @techmazed
• LinkedIn: https://www.linkedin.com/in/knownasar
• Founder: Nashville Artificial Intelligence Society (Twitter: @AINashville)

What we’ll cover
The synthetic data generation challenge
Introducing Deep Learning
Introducing RNN & Sequence to Sequence model
Introducing NLP concepts
Brief introduction to TensorFlow & FloydHub
The solution approach
Code walkthrough
What next!
Question time

The Synthetic Data Generation Challenge

The problem
The challenge is to generate our own TV dialogue scripts by training a Neural Network on an existing corpus of dialogue.
Dialogue from The Simpsons TV show:
Sentences 0 to 10:
[YEAR DATE 1989] © Twentieth Century Fox Film Corporation. All rights reserved.
Moe_Szyslak: (INTO PHONE) Moe's Tavern. Where the elite meet to drink.
Bart_Simpson: Eh, yeah, hello, is Mike there? Last name, Rotch.
Moe_Szyslak: (INTO PHONE) Hold on, I'll check. (TO BARFLIES) Mike Rotch. Mike Rotch. Hey, has anybody seen Mike Rotch, lately?
Moe_Szyslak: (INTO PHONE) Listen you little puke. One of these days I'm gonna catch you, and I'm gonna carve my name on your back with an ice pick.
Moe_Szyslak: What's the matter Homer? You're not your normal effervescent self.
Homer_Simpson: I got my problems, Moe. Give me another one.
Moe_Szyslak: Homer, hey, you should not drink to forget your problems.
Barney_Gumble: Yeah, you should only drink to enhance your social skills.

Introduction to the important concepts
Concepts
A brief discussion on the various technologies, techniques and concepts that we apply towards the solution.
• Deep Learning
• Recurrent Neural Network (RNN)
• Natural Language Processing (NLP)
• Technologies

What is Deep Learning?
Neural networks with a large number of parameters and layers.
Deep networks typically follow one of four fundamental network architectures:
• Unsupervised Pre-Trained Networks
• Convolutional Neural Networks
• Recurrent Neural Networks
• Recursive Neural Networks
• Also: Generative Adversarial Networks
Introducing Deep Learning
The behavior of a neural network is shaped by its architecture.
A network’s architecture can be defined, in part, by:
• number of neurons
• number of layers
• types of connections between layers
A Neural Network is an algorithm that learns to identify patterns in data.
Backpropagation is a technique to train a Neural Net: it computes the gradient of the error with respect to each weight, and gradient descent uses that gradient to update the weights.
Deep Learning: many-layer neural net + massive data (big data) + massive compute (GPU)
Multi-Layer Neural Network topology

Backpropagation Learning
Backpropagation is an important part of reducing error in a neural network model.
General Neural Network Training Pseudocode
function neural-network-learning( training-records ) returns network
    network <- initialize weights (randomly)
    start loop
        for each example in training-records do
            network-output = neural-network-output( network, example )
            actual-output = observed outcome associated with example
            update weights in network based on { example, network-output, actual-output }
        end for
    end loop when all examples correctly predicted or hit stopping conditions
    return network
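The training pseudocode can be sketched in runnable Python. This is a minimal illustration, not the talk's code: it assumes a single sigmoid neuron trained by gradient descent on a toy AND dataset, and the names neuron_output, train and AND_RECORDS are hypothetical.

```python
import math
import random

def neuron_output(weights, inputs):
    """Sigmoid of the weighted sum; weights[0] is the bias term."""
    total = weights[0] + sum(w * x for w, x in zip(weights[1:], inputs))
    return 1.0 / (1.0 + math.exp(-total))

def train(training_records, learning_rate=0.5, max_epochs=5000, seed=0):
    """Follow the pseudocode: random init, loop over examples, update weights."""
    rng = random.Random(seed)
    n_inputs = len(training_records[0][0])
    weights = [rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
    for _ in range(max_epochs):  # stopping condition: epoch limit
        for inputs, actual in training_records:
            output = neuron_output(weights, inputs)
            # Gradient of the cross-entropy loss w.r.t. the weighted sum.
            error = output - actual
            weights[0] -= learning_rate * error
            for i, x in enumerate(inputs):
                weights[i + 1] -= learning_rate * error * x
        # Stop early once every example is predicted correctly.
        if all(int(neuron_output(weights, x) > 0.5) == y
               for x, y in training_records):
            break
    return weights

# Toy dataset (an assumption for illustration): logical AND.
AND_RECORDS = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights = train(AND_RECORDS)
predictions = [int(neuron_output(weights, x) > 0.5) for x, _ in AND_RECORDS]
print(predictions)
```

A multi-layer network repeats the same loop; backpropagation is what carries the error gradient back through the extra layers.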

Introducing Recurrent Neural Network

What is RNN?
Recurrent Neural Networks have loops.
In the diagram (right), a chunk of neural network, A, looks at some input x_t and outputs a value h_t.
A loop allows information to be passed from one step of the network to the next.
Consider what happens if we unroll the loop:
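A rough sketch of the unrolled view, assuming a scalar tanh RNN with made-up weights (w_x, w_h and b are illustrative values, not taken from the talk). The loop and the hand-unrolled chain compute the same hidden states.

```python
import math

def rnn_step(x_t, h_prev, w_x=0.5, w_h=0.8, b=0.0):
    """One step of chunk A: combine the input x_t with the previous state."""
    return math.tanh(w_x * x_t + w_h * h_prev + b)

def run_rnn(inputs, h0=0.0):
    """The loop form: the same step is applied at every time step."""
    h = h0
    outputs = []
    for x_t in inputs:
        h = rnn_step(x_t, h)
        outputs.append(h)
    return outputs

xs = [1.0, 0.0, -1.0]
looped = run_rnn(xs)

# The unrolled form: one copy of the step per time step, chained by h.
h1 = rnn_step(xs[0], 0.0)
h2 = rnn_step(xs[1], h1)
h3 = rnn_step(xs[2], h2)
unrolled = [h1, h2, h3]

print(looped == unrolled)
```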

The flow
Recurrent Neural Network

Introducing Natural Language Processing (NLP) concepts

Tokenize
Given a character sequence and a defined document unit, tokenization is the task of chopping it up into pieces, called tokens, sometimes also throwing away certain characters, such as punctuation.
Here is an example:
Input: Friends, Romans, Countrymen, lend me your ears;
Output: Friends || Comma || Romans || Comma || Countrymen || Comma || lend || me || your || ears || semicolon ||
bye bye!
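A small sketch of tokenization using only the standard library. The tokenize function and its keep_punctuation flag are hypothetical names for illustration; the regular expression treats each run of word characters, and optionally each punctuation mark, as one token.

```python
import re

def tokenize(text, keep_punctuation=True):
    """Split text into word tokens, optionally keeping punctuation as tokens."""
    pattern = r"\w+|[^\w\s]" if keep_punctuation else r"\w+"
    return re.findall(pattern, text)

sentence = "Friends, Romans, Countrymen, lend me your ears;"
tokens_with_punct = tokenize(sentence)
tokens_words_only = tokenize(sentence, keep_punctuation=False)
print(tokens_with_punct)
print(tokens_words_only)
```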

One-Hot Encoding
One-hot encoding represents a categorical integer feature as a vector of zeros with a single 1 at the position of the category.
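For illustration, a minimal one-hot encoder in plain Python. The function name one_hot and the four-word vocabulary size are assumptions for this sketch.

```python
def one_hot(index, num_classes):
    """Return a list of zeros with a single 1 at position `index`."""
    if not 0 <= index < num_classes:
        raise ValueError("index must be in [0, num_classes)")
    vector = [0] * num_classes
    vector[index] = 1
    return vector

# Word id 2 in a hypothetical 4-word vocabulary.
encoded = one_hot(2, 4)
print(encoded)
```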

LSTMs
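LSTMs (Long Short-Term Memory networks) mainly address the vanishing gradient problem of plain RNNs; exploding gradients are usually handled separately, for example by gradient clipping. LSTMs help preserve the error that can be backpropagated through time and layers.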
Image courtesy - http://deeplearning.net/tutorial/lstm.html
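A rough sketch of why the cell state helps, assuming a scalar LSTM cell in its standard formulation (forget, input and output gates plus a candidate value). The parameter values are hypothetical and chosen so that the forget gate stays open and the input gate stays closed; the cell state then survives many steps almost unchanged.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x_t, h_prev, c_prev, params):
    """One scalar LSTM step. params maps gate name -> (w_x, w_h, bias)."""
    def pre_activation(gate):
        w_x, w_h, bias = params[gate]
        return w_x * x_t + w_h * h_prev + bias

    f = sigmoid(pre_activation("forget"))       # how much old state to keep
    i = sigmoid(pre_activation("input"))        # how much new value to write
    o = sigmoid(pre_activation("output"))       # how much state to expose
    g = math.tanh(pre_activation("candidate"))  # proposed new value
    c_t = f * c_prev + i * g                    # additive cell-state update
    h_t = o * math.tanh(c_t)
    return h_t, c_t

# Hypothetical parameters: forget gate near 1, input gate near 0.
params = {
    "forget": (0.0, 0.0, 20.0),
    "input": (0.0, 0.0, -20.0),
    "output": (0.0, 0.0, 0.0),
    "candidate": (1.0, 1.0, 0.0),
}
h, c = 0.0, 1.0
for x_t in [0.3, -0.7, 0.5] * 10:  # 30 time steps
    h, c = lstm_step(x_t, h, c, params)
print(c)
```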

Logits
Logistic Function:
The logit function is the inverse of the
logistic function.
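The inverse relationship can be checked numerically. This is a minimal sketch using the standard definitions logistic(x) = 1 / (1 + e^(-x)) and logit(p) = ln(p / (1 - p)).

```python
import math

def logistic(x):
    """Map any real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def logit(p):
    """Map a probability in (0, 1) back to a real number (the log-odds)."""
    return math.log(p / (1.0 - p))

x = 2.0
round_trip = logit(logistic(x))
print(round_trip)
```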

SoftMax
Multiclass Classification Model Output Layer
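As an illustration, softmax turns a vector of logits into a probability distribution over classes (here, over words in the vocabulary). This is a plain-Python sketch; the example logits are made up.

```python
import math

def softmax(logits):
    """Convert logits to probabilities that sum to 1."""
    largest = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - largest) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]
probabilities = softmax(logits)
print(probabilities)
```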

Brief introduction to technologies used

TensorFlow
TensorFlow is an open source software library by Google for designing, building, and training deep learning models.
Why use it:
• Thoughtful design and ease of use
• Deploy computation to one or more CPUs or GPUs
TensorFlow alternatives: Theano, Torch, Caffe, Neon, and Keras

FloydHub
“Floydhub is Heroku for Deep Learning. A Platform-as-a-Service for training
and deploying your models in the cloud with a few simple commands.”
# Install locally and log in
$ pip install -U floyd-cli
$ floyd login
# Run a Jupyter notebook on FloydHub
$ floyd run --mode jupyter --gpu
# Run Python code on a GPU
$ floyd run --gpu "python mnist_cnn.py"

Demo Time
Let us look at the code

Get the helper function.
Load the data
Summary of the data
Sample Data

Create a word embedding
Transform the words to ids.
Two dictionaries:
• Dictionary to go from the words to an id, we'll call vocab_to_int
• Dictionary to go from the id to word, we'll call int_to_vocab
Look at vocab_to_int
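One way to build the two dictionaries, as a sketch. The function name create_lookup_tables and the choice to give the most frequent words the smallest ids are assumptions for illustration; the talk's code may order the vocabulary differently.

```python
from collections import Counter

def create_lookup_tables(words):
    """Build word -> id and id -> word dictionaries from a list of words."""
    counts = Counter(words)
    # Most frequent words first; ties broken alphabetically for determinism.
    vocab = sorted(counts, key=lambda w: (-counts[w], w))
    vocab_to_int = {word: idx for idx, word in enumerate(vocab)}
    int_to_vocab = {idx: word for word, idx in vocab_to_int.items()}
    return vocab_to_int, int_to_vocab

words = "moe homer moe barney homer moe".split()
vocab_to_int, int_to_vocab = create_lookup_tables(words)
int_text = [vocab_to_int[w] for w in words]
print(vocab_to_int)
print(int_text)
```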

Tokenize Punctuation
This dictionary is used to tokenize the symbols and add the delimiter (||) around each one.
This separates each symbol as its own word, making it easier for the neural network to predict the next word.
Look at token dictionary
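A sketch of such a token dictionary and how it might be applied. The function names, the exact set of symbols, and the token spellings (for example ||Period||) are assumptions for illustration, not necessarily those used in the talk's code.

```python
def token_lookup():
    """Map punctuation symbols to delimiter-wrapped tokens."""
    return {
        ".": "||Period||",
        ",": "||Comma||",
        ";": "||Semicolon||",
        "!": "||Exclamation_Mark||",
        "?": "||Question_Mark||",
        "(": "||Left_Parentheses||",
        ")": "||Right_Parentheses||",
    }

def tokenize_punctuation(text, token_dict):
    """Lowercase the text and replace each symbol with its own token word."""
    text = text.lower()
    for symbol, token in token_dict.items():
        # Surround the token with spaces so split() treats it as a word.
        text = text.replace(symbol, " {} ".format(token))
    return text.split()

tokens = tokenize_punctuation("Hold on, I'll check.", token_lookup())
print(tokens)
```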

Let’s preprocess the data
Look at the pre-processed data

Build the Neural Network
We'll build the components necessary to build an RNN by implementing these functions:
• get_inputs
• get_init_cell
• get_embed
• build_rnn
• build_nn
• get_batches
Placeholders for Inputs

Build RNN Cell and Initialize

Word Embedding

Build RNN
Name for the operation
A tensor

build_nn:
• Apply embedding
• Build RNN
• Apply a fully connected layer
• Return the logits and final state

Create Batches
• The first element is a single batch of input with the shape [batch size, sequence length]
• The second element is a single batch of targets with the shape [batch size, sequence length]
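A rough sketch of batching in plain Python lists. It assumes the targets are the inputs shifted one word ahead (the usual setup for next-word prediction) and uses a simple contiguous layout; the talk's get_batches may arrange sequences differently.

```python
def get_batches(int_text, batch_size, seq_length):
    """Return a list of (inputs, targets) pairs.

    Each element of a pair has shape [batch_size, seq_length]; leftover
    words that do not fill a whole batch are dropped.
    """
    words_per_batch = batch_size * seq_length
    # Reserve one extra word so the last target sequence is complete.
    n_batches = (len(int_text) - 1) // words_per_batch
    batches = []
    for b in range(n_batches):
        inputs, targets = [], []
        for row in range(batch_size):
            start = b * words_per_batch + row * seq_length
            inputs.append(int_text[start:start + seq_length])
            targets.append(int_text[start + 1:start + seq_length + 1])
        batches.append((inputs, targets))
    return batches

batches = get_batches(list(range(20)), batch_size=2, seq_length=3)
first_inputs, first_targets = batches[0]
print(first_inputs)
print(first_targets)
```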

Neural Network Training
Number of epochs = 50
Batch size = 256
Size of the RNNs = 256
Length of sequence = 10
Learning rate = 0.01
Tuning Hyperparameters

Build the Graph
• Get Inputs
• Initialize cells
• Build NN => logits
• Apply Softmax
• Calculate Loss
• Compute Gradients
• Apply Optimizer
Loss function
Optimizer

Train the neural network
• Get batches
• Initialize variables
• Initialize state
• Enumerate batches
• Get inputs and hyperparameters
• Compute loss
• Show batch results every 100
• Save trained model

Training output

Get Tensors
Script Generation
Choose Word
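One plausible way to choose the next word is to sample from the softmax probabilities instead of always taking the most likely word, which keeps the generated script from looping on the same phrases. This is a sketch under that assumption; pick_word and its parameters are hypothetical names, and the talk's code may choose words differently.

```python
import random

def pick_word(probabilities, int_to_vocab, rng=None):
    """Sample a word id according to its probability and return the word."""
    rng = rng or random.Random()
    word_ids = list(range(len(probabilities)))
    chosen_id = rng.choices(word_ids, weights=probabilities, k=1)[0]
    return int_to_vocab[chosen_id]

int_to_vocab = {0: "moe", 1: "homer", 2: "||Period||"}
sampled = pick_word([0.1, 0.7, 0.2], int_to_vocab, rng=random.Random(0))
print(sampled)
```

Passing a seeded random.Random makes the sampling repeatable, which helps when comparing generated scripts between runs.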

Generate TV Script
Script Generation
• Load Model
• Get Tensors
• Setup Sentence generation
• Generate sentences
• Dynamic input
• Get prediction

Generated Script
Larger Dataset: https://www.kaggle.com/wcukierski/the-simpsons-by-the-data

Resources
The code associated with this presentation is available here:
https://github.com/syednasar/talks/tree/master/synthetic-dialog
For questions on the code or the implementation, reach out at:
Email: techtalk@nasars.com or Twitter: @techmazed

References & Recommendations
• Visualizing and Understanding Recurrent Networks (paper): https://arxiv.org/abs/1506.02078
• Blog by Andrej Karpathy: http://karpathy.github.io/2015/05/21/rnn-effectiveness/

Keep (deep) learning!
@techmazed
