alphablues - ML applied to text and image in chat bots
1. ML applied to text and image in chat bots
Indrek Vainu & Hendrik Luuk
2. Transform customer service into a truly competitive advantage
Utilize AI to automate, personalize and scale business processes
3. Chat bots are intelligent robots: they are able to infer what you want and, as a result, respond with appropriate actions. They operate autonomously, i.e. without human supervision.
7. Short Bio
• >15y of programming, >10y in a research lab
• Neuroscience PhD (2009, UT)
• Published papers in bioinformatics, neuroanatomy and general linguistics
• Full-stack engineer + data scientist
• ML expertise
• RoR, JavaScript, R, C++, Python
8. Applications of ML
• Image similarity
• Video encoding and reconstruction
• Intent prediction from text
11. Improved cost function for reconstruction
• Autoencoder (AE) combined with Generative Adversarial Net (GAN) is good at forming latent representations for accurate reconstruction of image content
Autoencoding beyond pixels using a learned similarity metric. https://arxiv.org/pdf/1512.09300.pdf
25. Create your own objective function
• intuition → equation
• Minimize the difference between prediction and target
• L(f(X), Y) → real number
• X – input
• f(X) – output from the model
• Y – target
• L – objective function
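The mapping L(f(X), Y) → real number can be made concrete with a minimal sketch; mean squared error is used here purely as an illustrative choice of L, not as the objective from the talk:

```python
import numpy as np

def mse_loss(f_X, Y):
    """Objective L(f(X), Y): mean squared difference between the
    model output f(X) and the target Y, reduced to a real number."""
    return float(np.mean((f_X - Y) ** 2))

# Example: model output vs. target
pred = np.array([1.0, 2.0, 3.0])
target = np.array([1.0, 2.5, 2.0])
print(mse_loss(pred, target))  # 0.41666...
```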
26. Create your own objective function
• Figure out the derivative using sympy
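A minimal sympy sketch of this step, using a per-sample squared-error objective as the illustrative L:

```python
import sympy as sp

f, y = sp.symbols('f y')   # model output f(X) and target Y
L = (f - y) ** 2           # per-sample squared-error objective
dL_df = sp.diff(L, f)      # symbolic derivative wrt the model output
print(dL_df)               # 2*f - 2*y
```

The printed expression is what gets hand-translated into the training code in the next step.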
27. Create your own objective function
• Implement on your favourite platform
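A sketch of the implementation step, using NumPy as a stand-in for "your favourite platform" and the squared-error objective from above as the illustrative L:

```python
import numpy as np

def mse_loss(pred, target):
    # L(f(X), Y) = mean((f(X) - Y)^2)
    return np.mean((pred - target) ** 2)

def mse_grad(pred, target):
    # dL/dpred = 2 * (pred - target) / n, matching the sympy derivative
    return 2.0 * (pred - target) / pred.size

pred = np.array([1.0, 2.0, 3.0])
target = np.array([0.0, 2.0, 4.0])
print(mse_loss(pred, target))   # 0.666...
print(mse_grad(pred, target))   # [ 0.666...  0.       -0.666...]
```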
28. Create your own objective function
• Plug in and enjoy!
[Figure: training curves comparing the CovCost objective with L2 loss]
33. Extending memory of RNN
• Sequence length == # of layers in unfolded RNN
• Vanishing gradient degrades memory
[Figure: RNN unfolded into a chain of identical cells]
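A toy scalar RNN (an illustrative sketch, not from the slides) shows why the unfolding matters: backprop applies the chain-rule factor w · tanh′(h) once per unfolded layer, so with |w| < 1 the gradient shrinks geometrically with sequence length:

```python
import math

# Scalar RNN: h_t = tanh(w * h_{t-1} + x_t). Backprop through T
# unfolded layers multiplies the gradient by w * tanh'(h_t) each step.
w, x = 0.5, 1.0
h, grad = 0.0, 1.0
for t in range(30):
    h = math.tanh(w * h + x)
    grad *= w * (1.0 - h ** 2)   # chain-rule factor for one layer
print(grad)   # tiny: the contribution of early inputs is effectively lost
```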
34. Extending memory of RNN
• What is a good activation function?
• little saturation (mostly non-zero gradient)
• mean activation 0 (faster convergence)
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. https://arxiv.org/pdf/1502.03167.pdf
Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs). https://arxiv.org/pdf/1511.07289.pdf
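An illustrative NumPy sketch of ELU against ReLU on these two criteria (the function names and the sampling setup are my own): ELU keeps a non-zero gradient for negative inputs and pulls the mean activation closer to zero than ReLU does.

```python
import numpy as np

def elu(x, alpha=1.0):
    # ELU: identity for x > 0, alpha * (exp(x) - 1) for x <= 0,
    # saturating smoothly toward -alpha for very negative inputs.
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

def elu_grad(x, alpha=1.0):
    # Derivative: 1 for x > 0, alpha * exp(x) otherwise (never zero).
    return np.where(x > 0, 1.0, alpha * np.exp(x))

z = np.random.default_rng(0).normal(size=100_000)
# Mean activation over standard-normal inputs: ELU sits nearer 0 than ReLU.
print(np.mean(elu(z)), np.mean(np.maximum(z, 0.0)))
```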
35. Extending memory of RNN
[Figure: fun(x) with its empirical and analytical derivatives w.r.t. x for TANH, ELU and RELU]
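The empirical-vs-analytical derivative comparison shown in the panels can be sketched as follows (the helper names are my own); the central finite difference should match the closed-form derivative to high precision:

```python
import numpy as np

def tanh_dy(x):
    # Analytical derivative of tanh: 1 - tanh(x)^2
    return 1.0 - np.tanh(x) ** 2

def empirical_dy(f, x, eps=1e-5):
    # Central finite difference, as in the "empirical derivative" panels
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)

x = np.linspace(-4.0, 4.0, 9)
# Maximum disagreement between the two estimates is tiny.
print(np.max(np.abs(empirical_dy(np.tanh, x) - tanh_dy(x))))
```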
36. Extending memory of RNN
• Does it matter?
• sRNN convergence probability w.r.t. length of sequence
Learning long-term dependencies with gradient descent is difficult. Bengio et al. 1994. http://www.dsi.unifi.it/~paolo/ps/tnn-94-gradient.pdf