7. The main problems that we aim to solve with
machine learning include helping customers
find the right answers to their queries (in the
help screen of the app) and helping agents to
diagnose and respond to customer queries
swiftly (in the internal tooling).
Our most impactful model is an encoder based
on [1] that we train on chat data.
[1] Attention is all you need
https://arxiv.org/abs/1706.03762
Customer Operations
13. What did we decide?
We created a tool for easily spinning up a new
microservice. It included:
● A Python web server (Sanic)
● Support for deploying any kind of model
(PyTorch, Keras, Scikit-Learn)
● Selectable add-ons, e.g. our in-house
model zoo library for NLP
● Command-line utilities for deploying
across the staging and production
environments.
14. Goal: if you can write a
predict() function, then you
can deploy a machine learning
model to production without
breaking anything.
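The contract above can be sketched with standard-library Python. This is a minimal, illustrative sketch of the idea, not the actual tool: the real service wraps the handler in a Sanic web server, and `make_handler` and the payload shape are assumed names.

```python
import json

def make_handler(predict):
    """Wrap a user-supplied predict() function with JSON I/O and error
    handling, so model code never has to know about HTTP.
    (Illustrative sketch; the real tool wires this into Sanic.)"""
    def handler(raw_body: bytes) -> bytes:
        try:
            payload = json.loads(raw_body)
            result = predict(payload["instances"])
            return json.dumps({"predictions": result}).encode()
        except Exception as exc:
            # A failing model call returns an error payload
            # instead of crashing the service.
            return json.dumps({"error": str(exc)}).encode()
    return handler

# Any callable that takes a list of instances works:
def predict(instances):
    return [len(x) for x in instances]

handler = make_handler(predict)
print(handler(b'{"instances": ["hi", "hello"]}'))
# → b'{"predictions": [2, 5]}'
```

The point of the design is the narrow interface: as long as `predict()` exists, serialization, errors, and deployment are someone else's (automated) problem.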
18. ● Diagnosing this problem via unit or
integration tests did not work; revisiting the
model training was too slow.
● We added validation testing: making easy
predictions in production and validating
that they return the expected results.
● We get alerted when they fail. Most of the
time, it’s the pipeline, not the model!
Validation testing
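A validation test of this kind can be sketched as follows. The case list, labels, and function names are all hypothetical; in practice the checks run against the deployed service and trigger an alert on failure.

```python
# Deliberately "easy" inputs with known-good answers (illustrative).
VALIDATION_CASES = [
    ("I lost my card", "card_lost"),
    ("How do I close my account?", "account_closure"),
]

def run_validation(predict, cases=VALIDATION_CASES):
    """Send easy inputs through the production pipeline and compare
    predictions with expectations. A failure usually means the
    pipeline broke, not the model."""
    failures = []
    for text, expected in cases:
        got = predict(text)
        if got != expected:
            failures.append((text, expected, got))
    return failures

# Stub standing in for the deployed endpoint:
def stub_predict(text):
    return "card_lost" if "card" in text else "account_closure"

assert run_validation(stub_predict) == []  # all easy cases pass
```

Unlike a unit test, this exercises the whole serving path (tokenization, feature pipeline, model, serialization), which is where most failures actually occur.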
19. Keeping track of the online
performance of machine
learning models goes beyond
what we traditionally do when
deploying software.
23. Similar problem
● How can we redirect a subset of
conversations, based on their topic, to a
different queue?
● This is desperately needed to handle a high
volume of messages.
● Most of the research around this focuses
on transfer learning or fine-tuning.
24. Reusable solution
● We wrote a service that interacts with our
saved response recommender system --
but uses the recommendations to make a
queue assignment decision.
● Deployed this in less than a day and
used it to tackle a period of high inbound
demand on customer service.
25. Layering a rule engine over an
existing model creates a new
decision system. Rule engines &
ML can coexist!
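The rules-over-recommender pattern can be sketched like this. Everything here is an assumption for illustration: the `recommend` interface, the response-to-queue mapping, and the score threshold stand in for the real saved-response recommender service.

```python
def route_conversation(message, recommend, default_queue="general"):
    """Reuse the recommender's output as a topic signal, then apply
    simple rules to pick a queue. (Hypothetical names throughout;
    `recommend` returns (response_id, score) pairs, best first.)"""
    TOPIC_QUEUES = {
        "resp_card_lost": "cards",
        "resp_payments": "payments",
    }
    recommendations = recommend(message)
    if recommendations:
        top_id, score = recommendations[0]
        # Rule: only reroute when the recommender is confident
        # and the topic has a dedicated queue.
        if score >= 0.8 and top_id in TOPIC_QUEUES:
            return TOPIC_QUEUES[top_id]
    return default_queue

# Stub recommender standing in for the deployed service:
recs = lambda msg: [("resp_card_lost", 0.93)] if "card" in msg else []
print(route_conversation("my card is missing", recs))  # → cards
print(route_conversation("hello", recs))               # → general
```

No retraining or fine-tuning is needed: the model stays untouched, and the routing behaviour lives entirely in the (cheap to change) rules.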
27. How can we quickly evaluate the
new state of the art in machine
learning?
28. 2018 was transformative for
NLP
From shallow to deep language model pretraining
● Deep Contextualised Word Representations (ELMo, Feb 2018)
● Universal Language Model Fine-tuning for Text Classification
(ULMFiT, May 2018)
● Pre-training of Deep Bidirectional Transformers for Language
Understanding (BERT, October 2018)
● … and more
http://ruder.io/nlp-imagenet/
29. ● Completely separated the process of
generating clean, well-formatted, and
labelled text datasets for supervised
learning from the code that does the
learning itself.
● Created a number of plug-and-play Colab
notebooks for ULMFiT, BERT (and PyText).
● Focused on time to results and common
requirements instead of specific prediction
problems. The most promising will be
taken forward later.
Approach research as an
exercise in creating templates
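The separation described above can be sketched as a tiny, model-agnostic export step; the function name and CSV contract are illustrative assumptions, not the actual pipeline.

```python
import csv
import io

def export_dataset(examples, out):
    """Write (text, label) pairs as clean CSV -- the only contract the
    modelling notebooks (ULMFiT, BERT, ...) depend on. Hypothetical
    sketch of the dataset/learning split described in the slides."""
    writer = csv.writer(out)
    writer.writerow(["text", "label"])
    for text, label in examples:
        writer.writerow([text.strip(), label])

buf = io.StringIO()
export_dataset([("  I lost my card ", "card_lost")], buf)
print(buf.getvalue())
```

Because every notebook consumes the same flat artifact, swapping in a new state-of-the-art model is a template change, not a pipeline rewrite.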
30. “To increase your success rate,
double your failure rate” (& get
to the same results in half the
time)
32. Speed & Machine Learning
● Deploying, validating, reusing, templating.
● Adopting the best practices from
engineering; tweaking the ones that do not
work for machine learning.
● Research time is well spent if we get some
tools (bonus: we also get some results).
● Always a work in progress!