Machine Learning, Faster
@neal_lathia
Machine Learning Lead
Monzo Chat
https://monzo.com/blog/2018/11/02/monzo-chat/
https://cloud.google.com/customers/monzo/
The main problems that we aim to solve with
machine learning include helping customers
find the right answers to their queries (in the
help screen of the app) and helping agents to
diagnose and respond to customer queries
swiftly (in the internal tooling).
Our most impactful model is an encoder based
on [1] that we train on chat data.
[1] Attention is all you need
https://arxiv.org/abs/1706.03762
Customer Operations
https://monzo.com/blog/2018/08/01/data-help/
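For illustration, a minimal sketch of the kind of encoder described above, using PyTorch's built-in nn.TransformerEncoder; the sizes, pooling, and vocabulary are hypothetical, not the actual production architecture:

```python
import torch
import torch.nn as nn

class ChatEncoder(nn.Module):
    """Hypothetical sketch of a transformer-based text encoder,
    loosely following [1]; dimensions are illustrative only."""

    def __init__(self, vocab_size=30_000, d_model=256, n_heads=4, n_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, token_ids):
        # Encode tokens, then mean-pool into one vector per message.
        hidden = self.encoder(self.embed(token_ids))
        return hidden.mean(dim=1)

# Embed a batch of two (already tokenised) chat messages.
model = ChatEncoder()
vectors = model(torch.randint(0, 30_000, (2, 16)))  # shape: (2, 256)
```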
How can we accelerate the
development of machine learning?
(1) Deploying, (2) Validating
(3) Reusing, (4) Templating
Deploying
From validated idea → production in < 1 day.
Quickly deploying models to
production is one of the biggest
roadblocks for impactful
machine learning.
https://monzo.com/blog/2016/09/19/building-a-modern-bank-backend/
What did we decide?
We created a tool that makes it easy to
spin up a new microservice. It included:
● A Python web server (Sanic)
● Support for deploying any kind of model
(PyTorch, Keras, Scikit-Learn)
● Optional add-ons, e.g. our in-house
model zoo library for NLP
● Command-line utilities for deploying
across the staging and production
environments.
Goal: if you can write a
predict() function, then you
can deploy a machine learning
model to production without
breaking anything.
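As a hedged illustration of that contract (not the actual internal tool), a minimal Sanic service wrapping a scikit-learn model loaded with joblib; the artefact path and route name are hypothetical:

```python
from sanic import Sanic
from sanic.response import json as json_response
import joblib

app = Sanic("prediction-service")
model = joblib.load("model.joblib")  # hypothetical artefact path

def predict(features):
    # The only function a model owner needs to write.
    return model.predict([features]).tolist()

@app.post("/predict")
async def handle_predict(request):
    # Everything around predict() (routing, serialisation,
    # deployment) comes from the scaffolding tool.
    return json_response({"prediction": predict(request.json["features"])})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8000)
```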
Validating
Maintaining & debugging production models.
Quickly diagnosing minor issues
with machine learning models in
production is nearly impossible.
When I search for X, where is Y?
● Diagnosing this problem via unit or
integration tests did not work; revisiting the
model training was too slow.
● We added validation testing: sending
simple, known inputs to the production
model & checking that it returns the
expected predictions.
● We get alerted when they fail. Most of the
time, it's the pipeline, not the model!
Validation testing
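One way such a validation test could look, sketched with a hypothetical endpoint, canary input, and label; in practice it would run on a schedule and feed the alerting system:

```python
import requests

# Hypothetical canary: a simple input whose prediction we know.
CANARY = {"text": "how do I reset my card PIN?"}
EXPECTED_LABEL = "card_pin"

def validate_production_model(url="https://ml.internal/predict"):
    # Call the live service exactly as a real client would.
    response = requests.post(url, json=CANARY, timeout=5)
    response.raise_for_status()
    label = response.json()["label"]
    # A failure here usually points at the pipeline, not the model.
    if label != EXPECTED_LABEL:
        raise AssertionError(
            f"canary drifted: expected {EXPECTED_LABEL}, got {label}"
        )
```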
Keeping track of the online
performance of machine
learning models goes beyond
what we traditionally do when
deploying software.
Reusing
> 1 feature from 1 model
Can we quickly reuse an existing
model to tackle a new problem?
Existing model
Similar problem
● How can we redirect a subset of
conversations, based on their topic, to a
different queue?
● This is desperately needed to handle a high
volume of messages.
● Most of the research around this focuses
on transfer learning or fine-tuning.
Reusable solution
● We wrote a service that interacts with our
saved-response recommender system,
but uses its recommendations to make a
queue assignment decision.
● Deployed it in less than a day & used it
to tackle a period of high inbound
demand on customer service.
Combining a rule engine over an
existing model creates a new
decision system. Rule engines &
ML can coexist!
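A sketch of that pattern; the recommender interface and the topic-to-queue mapping below are hypothetical, not the actual API:

```python
# Rules live in plain data: known topics route to a dedicated queue.
TOPIC_TO_QUEUE = {
    "lost_card": "card-operations",
    "payments": "payments-operations",
}

def assign_queue(conversation_text, recommender):
    # Reuse the existing model as-is: its top saved-response
    # suggestion carries the topic signal we need.
    top_suggestion = recommender.recommend(conversation_text)[0]
    # The rule engine sits on top and makes the routing decision.
    return TOPIC_TO_QUEUE.get(top_suggestion.topic, "general")
```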
Templating
Staying 10 steps behind the latest research
How can we quickly evaluate the
new state of the art in machine
learning?
2018 was transformative for
NLP
From shallow to deep language model pretraining
● Deep Contextualised Word Representations (ELMo, Feb 2018)
● Universal Language Model Fine-tuning for Text Classification
(ULMFit, May 2018)
● Pre-training of Deep Bidirectional Transformers for Language
Understanding (BERT, October 2018)
● … and more
http://ruder.io/nlp-imagenet/
● Completely separated the process of
generating clean, well-formatted,
labelled text datasets for supervised
learning from the code that does the
learning itself.
● Created a number of plug-and-play Colab
notebooks for ULMFit, BERT (and PyText).
● Focused on time to results and common
requirements instead of specific prediction
problems; the most promising
approaches are taken forward later.
Approach research as an
exercise in creating templates
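A minimal sketch of that dataset/model split, assuming pandas and a hypothetical source of (text, label) pairs; generation writes one clean, labelled file that every templated notebook reads, with no model code in sight:

```python
import pandas as pd

def build_dataset(raw_rows, out_path="chat_intents.csv"):
    # raw_rows: hypothetical (text, label) pairs from the chat pipeline.
    df = pd.DataFrame(raw_rows, columns=["text", "label"])
    df["text"] = df["text"].str.strip().str.lower()  # minimal cleaning
    df = df.dropna().drop_duplicates(subset="text")
    df.to_csv(out_path, index=False)
    return df

# Every plug-and-play notebook (ULMFit, BERT, PyText) then starts from:
# train_df = pd.read_csv("chat_intents.csv")
```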
“To increase your success rate,
double your failure rate” (& get
to the same results in half the
time)
Conclusions
Speed & Machine Learning
● Deploying, validating, reusing, templating.
● Adopting the best practices from
engineering; tweaking the ones that do not
work for machine learning.
● Research time is well spent if we get some
tools (bonus: we also get some results).
● Always a work in progress!
Thanks!
@neal_lathia
https://monzo.com/careers/
