Conversational Agents in Portuguese: A Study Using Deep Learning

Conversational Agents in Portuguese
A Study Using Deep Learning

Andherson C. Maeda
Developer Consultant @ ThoughtWorks
$ whoami

Agenda
● Chatbots
● Machine Learning
● Context
● Research
● Results
● Final Considerations
● Future Work

Chatbots timeline
60's: Eliza
Simulation of a Rogerian psychologist. Uses pieces of the input to build the output phrase (Weizenbaum)
90's: Julia
TinyMUD bot. Based on rule recognition and neural networks (Mauldin)
2000: Alice
Pattern matching. Inspired by XML: AIML (Wallace)
Neural Networks

Chatbots - types
◉ Task-oriented
○ Cognitive architecture
○ Goal: accomplish tasks
○ Recognize natural language
◉ Chatterbots
○ Reactive architecture
○ Imitate human conversation
○ Wide domain

Chatbots - relevance
◉ Their ability to hold a dialog in natural language
makes them attractive
◉ They can be available 24x7 and provide a
more natural interface for apps
◉ Big companies like Apple, Microsoft, Samsung,
Amazon and Google have been launching
products in this direction

Caveats (or not)
◉ Handcrafted rules
◉ Unknown questions generate default
answers
◉ Manual effort to build and maintain the
conversation database over time
○ Are there alternatives?
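To make these caveats concrete, here is a minimal sketch of a handcrafted, rule-based bot. The rules, answer templates and the `reply` function are hypothetical and only illustrate the idea: every rule is written by hand, and any input that matches no rule falls back to a default answer.

```python
import re

# Hypothetical handcrafted rules: (pattern, answer template).
RULES = [
    (re.compile(r"\bmy name is (\w+)", re.IGNORECASE), "Nice to meet you, {0}!"),
    (re.compile(r"\bi feel (\w+)", re.IGNORECASE), "Why do you feel {0}?"),
    (re.compile(r"\b(hi|hello)\b", re.IGNORECASE), "Hello! How can I help?"),
]
DEFAULT_ANSWER = "Sorry, I did not understand that."


def reply(message):
    """Return the answer of the first matching rule, else the default answer."""
    for pattern, template in RULES:
        match = pattern.search(message)
        if match:
            # Reuse pieces of the input in the output phrase.
            return template.format(*match.groups())
    return DEFAULT_ANSWER


print(reply("My name is Ana"))
print(reply("What is the weather?"))
```

Covering a new kind of question means adding another rule by hand, which is the maintenance effort listed above.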

Machine Learning
Deep Learning and so on...

Definition
What are Machine Learning and Deep Learning?

A learning definition:
"A computer program is said to learn from experience E with respect
to some class of tasks T and performance measure P if its
performance at tasks in T, as measured by P, improves with
experience E" (Mitchell, 1997)

A machine learning definition:
The use of computational resources and learning algorithms
over a dataset to infer mathematical functions that represent the
dataset's information (Bengio and Courville, 2016)

A deep learning definition:
The main difference between shallow learning and deep
learning is the deep composition of mathematical functions (Bengio
and Courville, 2016)

Deep Learning timeline
40's: Artificial Neuron
A neuron-based model can be used to perform logical calculations (McCulloch and Pitts, 1943)
60's: Perceptron
Applied the idea of weights to the artificial neuron model (Rosenblatt, 1958)
80's: XOR Problem and Multilayer Perceptron
The perceptron works only for linearly separable problems (Minsky and Papert, 1988)
XOR problem solved with the backpropagation algorithm (Rumelhart et al., 1985)
Neural Networks
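The XOR limitation can be illustrated with a small sketch. The weights below are hand-picked for illustration, not learned by backpropagation, and the function names are hypothetical. A grid search over single-neuron weights finds no solution for XOR, while a two-layer arrangement of the same neuron computes it.

```python
def step(z):
    """Threshold activation of the classic perceptron."""
    return 1 if z > 0 else 0


def perceptron(x1, x2, w1, w2, b):
    """Single artificial neuron: weighted sum followed by a threshold."""
    return step(w1 * x1 + w2 * x2 + b)


def xor_mlp(x1, x2):
    """Two-layer network with hand-picked weights: XOR = OR and not AND."""
    h_or = perceptron(x1, x2, 1.0, 1.0, -0.5)   # fires if at least one input is 1
    h_and = perceptron(x1, x2, 1.0, 1.0, -1.5)  # fires only if both inputs are 1
    return perceptron(h_or, h_and, 1.0, -1.0, -0.5)


INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]
xor_outputs = [xor_mlp(a, b) for a, b in INPUTS]

# Try every single-neuron weight combination on a coarse grid.
grid = [i / 2 for i in range(-4, 5)]
single_solves_xor = any(
    all(perceptron(a, b, w1, w2, bias) == (a ^ b) for a, b in INPUTS)
    for w1 in grid for w2 in grid for bias in grid
)

print(xor_outputs, single_solves_xor)
```

The grid search is only a demonstration on a finite set of weights. The general result is that no single threshold neuron can separate the XOR classes, because they are not linearly separable.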

Deep Learning timeline
90's: Cyclical Connections
Relaxes the feedforward constraint and allows cyclical connections, so sequences can be processed over time (Graves, 2012)
90's: LSTM
Introduces gates to address the vanishing and exploding gradient problems (Hochreiter, 1997)
2014: Sequence to Sequence Model
Applied LSTM layers to translation tasks between English and French with good results (Sutskever et al., 2014)
Neural Networks

Context
Where did this story start?

But… in English! Does it work in
Portuguese?

Research
Has anyone done the same? Are there libraries?

Research
● Most existing work is based on AIML
● All related papers are for English
● Lack of Portuguese dialog corpora
● Lack of libraries and frameworks
○ Just one, in Lua! (neuralconvo)

Research - Corpora
● Cornell Movie-Dialogs Corpus (English)
○ 220,579 sentences
● OpenSubtitles (Portuguese)
○ 2.6 billion sentences!
○ StarWars movie subtitles
■ 12,775 sentences
● Mobile App chat history (Portuguese)
○ 30,684 sentences

Research - Neural Unit
● Input tensors
○ One-hot encoding
● Output tensors
○ One-hot encoding
○ Negative log-likelihood
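As a rough illustration of these two ideas, the sketch below builds a one-hot vector and computes the negative log-likelihood of a one-hot target from log-softmax scores. The helper names are hypothetical and the code uses plain Python lists instead of a tensor library, so it is not the implementation used in the study.

```python
import math


def one_hot(index, size):
    """Vector of zeros with a single 1 at position `index`."""
    vec = [0] * size
    vec[index] = 1
    return vec


def log_softmax(scores):
    """Numerically stable log-softmax over a list of raw scores."""
    m = max(scores)
    log_sum = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_sum for s in scores]


def nll(log_probs, target):
    """Negative log-likelihood of a one-hot target."""
    return -sum(t * lp for t, lp in zip(target, log_probs))


target = one_hot(6, 12)            # word with index 6 in a 12-word dictionary
uniform = log_softmax([0.0] * 12)  # a model with no preference between words
print(nll(uniform, target))        # equals log(12) for a uniform prediction
```

The loss shrinks as the model puts more probability on the target word, which is what training tries to achieve.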

Research - Neural Topology
● Encoder
○ 1 layer LSTM
● Decoder
○ 1 layer LSTM
○ LogSoftMax layer
● Adam gradient descent
○ Mini batches

Recurrent Neural Networks and LSTM block
Figure panels: Recurrent Neural Network; Unfolded Recurrent Neural Network; LSTM block
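For readers without the figure, one time step of an LSTM block can be sketched as follows. This assumes the commonly used formulation with input, forget and output gates, which may differ in detail from the block drawn on the slide. The function names and the parameter layout are hypothetical, and plain Python lists stand in for tensors.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, x, b):
    """Compute W @ x + b, with W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def lstm_step(x, h_prev, c_prev, params):
    """One time step. `params` maps gate names "i", "f", "o", "g" to (W, b),
    where W acts on the concatenation of x and h_prev."""
    z = x + h_prev  # list concatenation
    pre = {name: affine(W, z, b) for name, (W, b) in params.items()}
    i = [sigmoid(v) for v in pre["i"]]    # input gate
    f = [sigmoid(v) for v in pre["f"]]    # forget gate
    o = [sigmoid(v) for v in pre["o"]]    # output gate
    g = [math.tanh(v) for v in pre["g"]]  # candidate cell state
    c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c_prev, i, g)]
    h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
    return h, c


# Toy call: input size 3, hidden size 2, all weights and biases set to zero.
zero = ([[0.0] * 5 for _ in range(2)], [0.0, 0.0])
params = {name: zero for name in "ifog"}
h, c = lstm_step([1.0, 0.0, 0.0], [0.0, 0.0], [2.0, -2.0], params)
print(h, c)
```

With zero weights every gate is half open, so the cell state is simply halved. The additive update of `c` is what lets gradients flow across many time steps.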

Research - Processing
- Oi, como vai você? (Hi, how are you?)
- Bem, obrigado! (Fine, thanks!)
- Onde você está? (Where are you?)
- Em casa (At home)
[
{
["5","4","3","2","1","0","10"],
["6","1","7","8","10","11","11"]
},
{
["11","11","8","7","1","6","10"],
["9","4","11","5","10","11","11"]
},
{
["9","4","11","5","10","11","11"],
["11","11","10","11","11","11","11"]
}
]
{
"0": "Oi",
"1": ",",
"2": "como",
"3": "vai",
"4": "voce",
"5": "?",
"6": "Bem",
"7": "obrigado",
"8": "!",
"9": "Onde",
"10": "<EOS>",
"11": "<UNK>"
}
[[
[0,0,0,0,0,1,0,0,0,0,0,0],
[0,0,0,0,1,0,0,0,0,0,0,0],
[0,0,0,1,0,0,0,0,0,0,0,0],
[0,0,1,0,0,0,0,0,0,0,0,0],
[0,1,0,0,0,0,0,0,0,0,0,0],
[1,0,0,0,0,0,0,0,0,0,0,0],
[0,0,0,0,0,0,0,0,0,0,1,0]
],
[
[0,0,0,0,0,0,1,0,0,0,0,0],
[0,1,0,0,0,0,0,0,0,0,0,0],
[0,0,0,0,0,0,0,1,0,0,0,0],
[0,0,0,0,0,0,0,0,1,0,0,0],
[0,0,0,0,0,0,0,0,0,0,1,0],
[0,0,0,0,0,0,0,0,0,0,0,1],
[0,0,0,0,0,0,0,0,0,0,0,1]
]]
Panels above, in order: Dialog, Pairs, Dictionary, Tensors (one-hot encoding of the first pair)
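The preprocessing on this slide can be approximated with the sketch below. The conventions are inferred from the first two pairs and are assumptions, not confirmed details of the original code: the input sentence is reversed, left-padded with `<UNK>` and closed with `<EOS>`, while the target keeps its order, ends with `<EOS>` and is right-padded with `<UNK>`. Accents are stripped because the dictionary lists `voce`. All function names are hypothetical, and indices are integers rather than the strings shown on the slide.

```python
import re
import unicodedata

EOS, UNK = "<EOS>", "<UNK>"
# Dictionary copied from the slide (token -> index).
VOCAB = {"Oi": 0, ",": 1, "como": 2, "vai": 3, "voce": 4, "?": 5,
         "Bem": 6, "obrigado": 7, "!": 8, "Onde": 9, EOS: 10, UNK: 11}
MAX_LEN = 7  # sequence length used on the slide


def tokenize(sentence):
    """Strip accents, then split into words and punctuation marks."""
    decomposed = unicodedata.normalize("NFD", sentence)
    plain = "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")
    return re.findall(r"\w+|[^\w\s]", plain)


def to_ids(tokens):
    """Map tokens to indices, using <UNK> for out-of-dictionary words."""
    return [VOCAB.get(tok, VOCAB[UNK]) for tok in tokens]


def encode_input(sentence):
    """Reversed ids, left-padded with <UNK>, then <EOS> (assumed convention)."""
    ids = to_ids(tokenize(sentence))[::-1][:MAX_LEN - 1]
    return [VOCAB[UNK]] * (MAX_LEN - 1 - len(ids)) + ids + [VOCAB[EOS]]


def encode_target(sentence):
    """Ids in order, then <EOS>, right-padded with <UNK> (assumed convention)."""
    ids = to_ids(tokenize(sentence))[:MAX_LEN - 1] + [VOCAB[EOS]]
    return ids + [VOCAB[UNK]] * (MAX_LEN - len(ids))


def one_hot_rows(ids):
    """One row per token, with a single 1 at the token's index."""
    return [[1 if col == i else 0 for col in range(len(VOCAB))] for i in ids]


first_input = encode_input("Oi, como vai você?")
first_target = encode_target("Bem, obrigado!")
print(first_input, first_target)
```

Under these assumptions the sketch reproduces the index lists of the first two pairs, and `one_hot_rows(first_input)` gives the first tensor shown above.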

Research - Processing
[
[0,0,0,0,0,1,0,0,0,0,0,0],
[0,0,0,0,1,0,0,0,0,0,0,0],
[0,0,0,1,0,0,0,0,0,0,0,0],
[0,0,1,0,0,0,0,0,0,0,0,0],
[0,1,0,0,0,0,0,0,0,0,0,0],
[1,0,0,0,0,0,0,0,0,0,0,0],
[0,0,0,0,0,0,0,0,0,0,1,0]
]
Encoder, sentence context, Decoder (recurrent autoencoder)
Decoder output scores:
[0.1,0.002,0.01,0.03,0.11,0.09,0.7,0.01,0.12,0.2,0.1,0.37]
Highest score, at index 6:
[0,0,0,0,0,0,1,0,0,0,0,0] = Bem
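The last step of this slide, turning the decoder's score vector into a word, can be sketched as a greedy pick of the highest score. The scores and the dictionary are copied from the slides, while the function name `greedy_pick` is hypothetical and greedy selection is an assumption about how the output word was chosen.

```python
# Dictionary from the previous slide, as a list indexed by token id.
ID_TO_TOKEN = ["Oi", ",", "como", "vai", "voce", "?", "Bem",
               "obrigado", "!", "Onde", "<EOS>", "<UNK>"]

# Decoder output scores copied from the slide (illustrative values).
scores = [0.1, 0.002, 0.01, 0.03, 0.11, 0.09, 0.7, 0.01, 0.12, 0.2, 0.1, 0.37]


def greedy_pick(scores):
    """Return the index of the highest score and its one-hot vector."""
    best = max(range(len(scores)), key=scores.__getitem__)
    one_hot = [1 if i == best else 0 for i in range(len(scores))]
    return best, one_hot


best, one_hot = greedy_pick(scores)
token = ID_TO_TOKEN[best]
print(best, token)
```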

Research - English Corpus
Tests
● Corpus: 10,000 sentences
● Autoencoder: 1,000 units per layer

Research - Portuguese Corpus
Tests

Research - StarWars Corpus
(TecnoPUC tests)
● Autoencoder: 400 units per layer
○ 600 fewer units, to reduce costs

Research - Mobile App Chat
Tests
● Personality!

Research - Turn taking
● Since turn taking is not clearly marked,
especially in subtitles, we trained our model to
predict the next sentence given the previous
one (Sutskever et al., 2015)
● Word generation stops at 20 words
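Both points can be sketched in a few lines. `consecutive_pairs` builds training pairs by treating every sentence as the answer to the one before it, and `generate` stops at `<EOS>` or at the 20-word cap. The `next_word` callback is a hypothetical stand-in for the trained decoder, and both function names are illustrative.

```python
MAX_WORDS = 20
EOS = "<EOS>"


def consecutive_pairs(sentences):
    """Pair every sentence with the one that follows it."""
    return list(zip(sentences, sentences[1:]))


def generate(next_word, max_words=MAX_WORDS):
    """Call next_word(words_so_far) until it returns <EOS> or the cap is reached."""
    words = []
    while len(words) < max_words:
        word = next_word(words)
        if word == EOS:
            break
        words.append(word)
    return words


subtitles = ["Oi, como vai você?", "Bem, obrigado!", "Onde você está?", "Em casa"]
pairs = consecutive_pairs(subtitles)
capped = generate(lambda words: "oi")  # a decoder stand-in that never emits <EOS>
print(len(pairs), len(capped))
```

Four sentences yield three pairs, which matches the three pairs on the Processing slide.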

Results
● 90 users chatted with the bot
● 1,502 conversation pairs
● Users from:
○ Facebook
○ Software Engineering classes
○ Coworkers
● 1-week test
● Users aged between 17 and 62

Users' age range and job area

Users' profile and knowledge about chatbots

Users' knowledge about StarWars and chatbots

Sample of user dialog using StarWars tokens

Sample of user dialog without StarWars tokens

Users' age by group and individually

Users' impressions after chatting with the bot

Final Considerations
● Considering the computational capacity available
and the absence of other AI techniques, the results
using movie subtitles were satisfactory.
● We ran an experiment using chat app
history, and the results were similar to those of
Sutskever and Vinyals
○ Personality traits!

Future Work
● Try more AI techniques, such as:
○ Dimensionality reduction
○ NLP preprocessing
○ More hidden layers and so on...
● Try another turn-taking heuristic
● Experiment with new frameworks like PyTorch
● Evaluate it with other Portuguese corpora

Any questions?
You can find me at
◉ @maedabr
◉ maeda.br@gmail.com
◉ amaeda@thoughtworks.com
◉ andherson.maeda@acad.pucrs.br
Thanks!
http://bagu.al/1Bn
Paper published at KDMiLe
