As a machine learning practitioner, you have probably been asked: how can I use machine learning to solve my problem? In this talk, we'll present a few of the challenges of setting up a machine learning pipeline in the real world. We'll explain why it is fundamentally different from a typical software engineering pipeline, and we'll (try to) give a few best practices to help software engineers "think ML" and prepare for their collaboration with data scientists.
Recording: https://youtu.be/TZOWthpeqUY?si=MxQfT9FhPSx7fc1X&t=481
15. The Classic Software Engineering System
Code = clear contract
Avoid failures with:
● Unit tests, regression tests…
● Version control, code reviews…
● Monitoring
16. The ML Software System
Code = weak contract
Data processing becomes key. Plumbing grows.
Sculley et al. Hidden technical debt in machine learning systems, NeurIPS 2015.
24. Should you use Machine Learning?
Are you reaching the limits of your current system?
- Too many rules to handle?
- Too complex to create rules?
- No mathematical solution?
What are the consequences of a wrong prediction?
26. 1. Quantify your performance
Design metrics that measure how good your output is.
● Sequences of moves: game won?
● Individual moves: proxy score
27. 1. Quantify your performance
✅ or ❌?
There is a gap between the ideal you want to optimize and what you can actually measure.
28. 1. Quantify your performance
You may not be able to compute the same metrics online and offline.
- Online: number of games won, engagement…
- Offline: prediction accuracy, reconstruction accuracy…
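Following the game example above, a minimal sketch of an online metric versus an offline proxy metric (function names and the sample moves are hypothetical):

```python
def win_rate(game_results):
    """Online metric: fraction of games won (1 = win, 0 = loss)."""
    return sum(game_results) / len(game_results)

def move_accuracy(predicted_moves, reference_moves):
    """Offline proxy: fraction of predicted moves matching a reference player."""
    matches = sum(p == r for p, r in zip(predicted_moves, reference_moves))
    return matches / len(reference_moves)

print(win_rate([1, 0, 1, 1]))                                 # 0.75
print(move_accuracy(["e4", "d4", "c4"], ["e4", "d4", "g3"]))  # 0.666...
```

The proxy is cheap to compute offline on logged data, but it is only correlated with the online metric you actually care about, which is why the two must be reconciled later (step 7).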
29. 2. Have a strong non-ML baseline
Your current algorithm is a good candidate.
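A non-ML baseline can be as simple as the hand-written rules you already run. A sketch (feature names and thresholds are hypothetical) of wrapping such rules so they can be scored with the same metric as any future model:

```python
def rule_based_baseline(features):
    """Hypothetical hand-written rule: alert when either threshold is exceeded."""
    return 1 if features["errors_per_min"] > 10 or features["latency_ms"] > 500 else 0

def accuracy(predict, labelled_data):
    """Score any predictor the same way: rules today, an ML model later."""
    return sum(predict(x) == y for x, y in labelled_data) / len(labelled_data)

data = [
    ({"errors_per_min": 3,  "latency_ms": 120}, 0),
    ({"errors_per_min": 42, "latency_ms": 90},  1),
    ({"errors_per_min": 2,  "latency_ms": 800}, 1),
]
print(accuracy(rule_based_baseline, data))  # 1.0
```

Keeping the evaluation function independent of the predictor makes the later rules-vs-model comparison a one-line change.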
30. 3. Collect Data
- Lots of data
- Store it!
- Use a data-analysis-friendly format
- Collect labels!
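One way to apply these points, sketched with the standard library only (all field names are hypothetical): log every request as one self-describing JSON line, so labels can be joined in later and the file loads directly into analysis tools.

```python
import json

def make_record(event_id, features, prediction, label=None):
    """One event per line (JSONL): easy to append, easy to load for analysis.
    The label is often unknown at prediction time; store None now and
    join the label in later using event_id."""
    return json.dumps({"event_id": event_id, "features": features,
                       "prediction": prediction, "label": label})

line = make_record(42, {"user_age": 31, "country": "FR"}, prediction=0.82)
# with open("events.jsonl", "a") as f: f.write(line + "\n")
```

Columnar formats such as Parquet serve the same purpose at scale; the key point is that every record carries its features, its prediction, and a slot for its label.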
31. 4. Analyse and Clean your Data
- Missing data?
- Outliers?
- Failure modes?
- Change over time?
Are there some obvious patterns that emerge?
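A minimal sketch of the first two checks using only the standard library (the sample values are made up); a real pipeline would typically use pandas or similar:

```python
import statistics

def data_quality_report(values):
    """Count missing entries and flag crude outliers using the median
    absolute deviation (MAD), which is robust to the outliers themselves."""
    present = [v for v in values if v is not None]
    med = statistics.median(present)
    mad = statistics.median([abs(v - med) for v in present])
    outliers = [v for v in present if mad > 0 and abs(v - med) > 5 * mad]
    return {"missing": len(values) - len(present), "outliers": outliers}

print(data_quality_report([1.0, 1.2, None, 0.9, 120.0]))
# {'missing': 1, 'outliers': [120.0]}
```

MAD is used instead of the standard deviation because a single extreme value inflates the standard deviation enough to hide itself.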
32. 5. Run Offline Experiments
Improve your baseline. Change the parameters. Use more inputs.
Still not ML, but now you can measure:
- How the output distribution changes
- How the performance changes
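Measuring how the output distribution shifts between two versions can be as simple as histogramming the outputs on the same inputs (the functions and thresholds below are hypothetical):

```python
from collections import Counter

def output_distribution(predict, inputs):
    """Histogram of outputs; compare it across versions to spot shifts."""
    return Counter(predict(x) for x in inputs)

baseline  = lambda x: x > 5   # current rule
candidate = lambda x: x > 3   # tweaked parameter under test
inputs = [1, 2, 4, 6, 8]
print(output_distribution(baseline, inputs))   # Counter({False: 3, True: 2})
print(output_distribution(candidate, inputs))  # Counter({True: 3, False: 2})
```

The same harness later takes an ML model in place of `candidate`, so the measurement infrastructure is ready before any model exists.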
33. 6. Prepare for changes in production
Can you compare two versions of your function in a production setting?
- Shadow pipeline
- Blue/green testing
- A/B tests
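A shadow pipeline, for instance, can be sketched in a few lines (all names are hypothetical): the candidate runs on live traffic, but its output is only logged, never served.

```python
def shadow_call(live_fn, candidate_fn, request, log):
    """Serve the live version; run the candidate on the same input for comparison."""
    live_out = live_fn(request)
    try:
        cand_out = candidate_fn(request)
        log.append({"request": request, "live": live_out,
                    "candidate": cand_out, "agree": live_out == cand_out})
    except Exception as err:
        # the candidate must never be able to break the live path
        log.append({"request": request, "error": repr(err)})
    return live_out

log = []
result = shadow_call(lambda x: x * 2, lambda x: x * 2 + 1, 10, log)
print(result)           # 20 (the user only ever sees the live output)
print(log[0]["agree"])  # False
```

Blue/green and A/B testing differ in that real traffic actually hits the new version; shadowing is the safest first step.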
38. 7. Validate Offline/Online agreement
- Put the new baseline into production
- Measure the outcome
Does it match your expectations?
- Yes: congratulations! You're good to go and have fun with shiny ML models (OK, not so fast, I'm joking).
- No: time to investigate.
39. 7. Validate Offline/Online agreement
If online results don't match offline expectations:
- Are you sure your metrics are good?
- Is your offline data good enough?
- Are there hidden feedback loops skewing the data?
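One crude way to operationalise "does it match?" (the tolerance is an arbitrary assumption, not a rule from the talk): check that the online gain has the same sign as, and a comparable magnitude to, the offline estimate.

```python
def gains_agree(offline_gain, online_gain, tolerance=0.5):
    """Same sign, and within a relative tolerance of the offline estimate."""
    if offline_gain == 0:
        return online_gain == 0
    ratio = online_gain / offline_gain
    return ratio > 0 and abs(ratio - 1) <= tolerance

print(gains_agree(0.10, 0.08))   # True  (close enough to the expected gain)
print(gains_agree(0.10, -0.02))  # False (sign flipped: investigate)
```

A failed check does not tell you which of the three questions above is the culprit; it only tells you to stop and investigate before shipping anything more complex.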
44. Remember…
- Machine learning engineering is not exactly like software engineering
- Think data. CLEAN data.
- No-ML can be a strong baseline and can be OK.
- Evaluation is key: you need to be able to evaluate the cost/performance tradeoff.
- You CAN anticipate your ML future
45. Resources
- Machine Learning Engineering in Action, Ben Wilson
- Machine Learning Engineering, Andriy Burkov
- Designing Machine Learning Systems, Chip Huyen
- MLOps Zoomcamp, YouTube series
46. References
- Léon Bottou's keynote at ICML 2015
- Software 2.0, blog post by Andrej Karpathy (2017)
- Sculley et al. Hidden technical debt in machine learning systems, NeurIPS 2015.
- Amershi et al. Software engineering for machine learning: A case study, ICSE-SEIP 2019.
- Biswas et al. The art and practice of data science pipelines: A comprehensive study of data science pipelines in theory, in-the-small, and in-the-large, ICSE 2022.