1) The document discusses research on developing a conversational agent in Portuguese using deep learning techniques. It provides background on chatbots and machine learning approaches.
2) The researcher collected various corpora in Portuguese to train sequence-to-sequence models for question answering. Tests were run using different corpora and model architectures.
3) Preliminary results from user tests with 90 participants were positive based on 1,502 conversation pairs, though the researcher identifies areas for future improvement including using more advanced techniques and evaluating on additional Portuguese data.
6. Chatbots timeline
60's: Eliza. Simulation of a Rogerian psychologist. Uses pieces of the input to build the output phrase (Weizenbaum)
90's: Julia. TinyMUD bot. Based on rule recognition and neural networks (Mauldin)
2000: Alice. Pattern matching. Inspired by XML: AIML (Wallace)
Neural Networks
8. Chatbots - relevance
◉ Their ability to hold a dialog in natural language makes them attractive
◉ They can be available 24x7 and provide a more natural interface for apps
◉ Big companies like Apple, Microsoft, Samsung, Amazon and Google have been launching products in this direction
10. Caveats (or not)
◉ Handcrafted rules
◉ Unknown questions will generate default
answers
◉ Manual effort to build and maintain the conversation database over time
○ Are there alternatives?
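A minimal sketch of the caveats above, assuming a bot built from handcrafted pattern rules. The rules, answers and the `reply` function are hypothetical illustrations, not taken from any system named in these slides. Any message that matches no rule falls through to the default answer, and every new topic needs a new rule written by hand.

```python
import re

# Hypothetical handcrafted rules: (pattern, canned answer) pairs.
RULES = [
    (re.compile(r"\b(hi|hello)\b", re.IGNORECASE), "Hello! How can I help you?"),
    (re.compile(r"\byour name\b", re.IGNORECASE), "I am a rule-based bot."),
]

# Returned whenever no rule matches the incoming message.
DEFAULT_ANSWER = "Sorry, I did not understand that."


def reply(message: str) -> str:
    """Return the answer of the first matching rule, else the default answer."""
    for pattern, answer in RULES:
        if pattern.search(message):
            return answer
    return DEFAULT_ANSWER
```

Covering a new kind of question means adding another entry to `RULES`, which is the maintenance cost the slide points to.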
13. A learning definition:
"A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E" (Mitchell, 1997)
14. A machine learning definition:
It is the use of computational resources and learning algorithms over a dataset to infer mathematical functions that represent the dataset's information (Bengio and Courville, 2016)
15. A deep learning definition:
The main difference between shallow learning and deep learning is the deep composition of mathematical functions. (Bengio and Courville, 2016)
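The idea of deep function composition can be illustrated with a small sketch. The scalar `layer` and `compose` helpers and all weight values are assumptions made for this example: a shallow model applies one learned function, while a deep model chains several of them.

```python
import math
from functools import reduce


def layer(weight: float, bias: float):
    """Build one scalar layer: an affine map followed by tanh."""
    return lambda x: math.tanh(weight * x + bias)


def compose(*functions):
    """Compose left to right: compose(f, g)(x) == g(f(x))."""
    return lambda x: reduce(lambda value, fn: fn(value), functions, x)


# Shallow: a single function f1(x).
shallow = layer(0.5, 0.1)

# Deep: the composition f3(f2(f1(x))), with arbitrary illustrative weights.
deep = compose(layer(0.5, 0.1), layer(-1.2, 0.3), layer(2.0, 0.0))
```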
17. Deep Learning timeline
40's: Artificial Neuron. It is possible to use a neuron-based model to perform logical calculations. (McCulloch and Pitts, 1943)
60's: Perceptron. Applied the idea of weights to the artificial neuron model. (Rosenblatt, 1958)
80's: XOR Problem and Multilayer Perceptron. The perceptron works only for linearly separable problems (Minsky and Papert, 1988). XOR problem solved by the backpropagation algorithm (Rumelhart et al., 1985)
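The XOR limitation can be checked with a short sketch. It makes two assumptions that are not in the slides: the single perceptron is searched over a small grid of weights rather than trained, and the two-layer network uses weights set by hand rather than learned with backpropagation. It only shows that a hidden layer makes XOR representable.

```python
from itertools import product

# Truth table of XOR.
XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}


def step(z: float) -> int:
    """Threshold activation used by the classic perceptron."""
    return 1 if z > 0 else 0


def perceptron(x1, x2, w1, w2, b):
    """A single perceptron with two inputs."""
    return step(w1 * x1 + w2 * x2 + b)


def single_perceptron_solves_xor(grid):
    """Search a grid of (w1, w2, b) for a single perceptron that matches XOR."""
    for w1, w2, b in product(grid, repeat=3):
        if all(perceptron(x1, x2, w1, w2, b) == y for (x1, x2), y in XOR.items()):
            return True
    return False


def two_layer_xor(x1, x2):
    """Hand-weighted two-layer network: XOR = OR and not AND."""
    h_or = perceptron(x1, x2, 1, 1, -0.5)    # fires if at least one input is 1
    h_and = perceptron(x1, x2, 1, 1, -1.5)   # fires only if both inputs are 1
    return perceptron(h_or, h_and, 1, -1, -0.5)


# Illustrative grid of candidate weights from -2.0 to 2.0 in steps of 0.5.
GRID = [i / 2 for i in range(-4, 5)]
```

The grid search finds no solution because XOR is not linearly separable, so no choice of weights for a single perceptron can work.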
18. Deep Learning timeline
90's: Cyclical Connections. Relaxed the feedforward condition to allow cyclical connections over sequences through time (Graves, 2012)
90's: LSTM. Introduces gates to address the vanishing and exploding gradient problems (Hochreiter, 1997)
2014: Sequence to Sequence Model. Applied LSTM layers to translation tasks between English and French with good results (Sutskever et al., 2014)
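The gating idea from the LSTM entry above can be sketched with a single scalar cell. This is a simplified illustration, not the model used in this research. It assumes the commonly used formulation with a forget gate, which was added after the 1997 design, and the `params` layout is hypothetical.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One step of a scalar LSTM cell.

    params maps a gate name to (w_x, w_h, bias); this layout is hypothetical.
    """
    def pre(name):
        w_x, w_h, bias = params[name]
        return w_x * x + w_h * h_prev + bias

    f = sigmoid(pre("forget"))       # how much of the old cell state to keep
    i = sigmoid(pre("input"))        # how much of the candidate to write
    o = sigmoid(pre("output"))       # how much of the cell state to expose
    g = math.tanh(pre("candidate"))  # candidate value for the cell state
    c = f * c_prev + i * g           # additive update of the cell state
    h = o * math.tanh(c)             # new hidden state
    return h, c
```

When the forget gate is close to 1 and the input gate close to 0, the cell state is carried forward almost unchanged, which is what lets information and gradients survive over many time steps.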
25. Research
● Most existing work is based on AIML
● All related papers are for English
● Lack of Portuguese dialog corpora
● Lack of libraries and frameworks
○ Just one, in Lua! (neuralconvo)
26. Research - Corpora
● Cornell Movie-Dialogs Corpus (English)
○ 220,579 sentences
● OpenSubtitles (Portuguese)
○ 2.6 billion sentences!
○ Star Wars movie subtitles
■ 12,775 sentences
● Mobile App chat history (Portuguese)
○ 30,684 sentences
35. Research - English Corpus Tests
● Corpus: 10,000 sentences
● Autoencoder: 1,000 units per layer
36. Research - Portuguese Corpus Tests
● Corpus: 10,000 sentences
● Autoencoder: 1,000 units per layer
37. Research - StarWars Corpus (TecnoPUC tests)
● Corpus: 12,775 sentences
● Autoencoder: 400 units per layer
○ - 600 units to reduce costs
38. Research - Mobile App Chat Tests
● Corpus: 30,684 sentences
● Autoencoder: 1,000 units per layer
● Personality!
39. Research - Turn taking
● Since turn taking is not clearly identified, mainly in subtitles, we trained our model to predict the next sentence given the previous one (Sutskever et al., 2015)
● Word generation stops at a length of 20 words
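The turn-taking heuristic above can be sketched as follows. The function names, the `<eos>` end token and the `next_word` callback are assumptions for illustration; the slides do not say how the 20-word limit was enforced, so the decoding loop is only one plausible reading.

```python
MAX_WORDS = 20          # generation cap stated on the slide
END_TOKEN = "<eos>"     # hypothetical end-of-sentence marker


def make_pairs(sentences):
    """Pair each sentence with the one that follows it: (previous, next)."""
    return list(zip(sentences, sentences[1:]))


def generate(next_word, prompt, max_words=MAX_WORDS):
    """Collect words from a hypothetical next_word(prompt, words_so_far) model.

    Stops at the end token or after max_words words, whichever comes first.
    """
    words = []
    while len(words) < max_words:
        word = next_word(prompt, words)
        if word == END_TOKEN:
            break
        words.append(word)
    return words
```

With subtitles, `make_pairs` treats every line as a turn, so consecutive lines spoken by the same character are also paired. That is the limitation the heuristic accepts.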
41. Results
● 90 users have used the bot
● 1,502 conversation pairs
● Users from:
○ Facebook
○ Software Engineering Classes
○ Coworkers
● 1-week test
● Users aged between 17 and 62 years old
50. Final Considerations
● The results using movie subtitles were satisfactory, considering the computational capacity used and the lack of other AI techniques.
● We ran an experiment using chat app history, and the results were similar to those of Sutskever and Vinyals
○ Personality traits!
52. Future Work
● Try to use more AI techniques, like
○ Dimensionality reduction
○ Applying some NLP preprocessing
○ More hidden layers, and so on
● Try another turn-taking heuristic
● Experiment with new frameworks like PyTorch
● Evaluate it with other corpora in Portuguese
53. Any questions?
You can find me at
◉ @maedabr
◉ maeda.br@gmail.com
◉ amaeda@thoughtworks.com
◉ andherson.maeda@acad.pucrs.br
Thanks!
http://bagu.al/1Bn
Paper published at KDMiLe