As a machine learning practitioner, you have probably been asked: how can I use machine learning to solve my problem? In this talk, we'll present a few of the challenges of setting up a machine learning pipeline in the real world. We'll explain why it is fundamentally different from a typical software engineering pipeline, and we'll (try to) give a few best practices to help software engineers "think ML" and prepare for their collaboration with data scientists.
Recording: https://youtu.be/TZOWthpeqUY?si=MxQfT9FhPSx7fc1X&t=481
15. The Classic Software Engineering System
Code = clear contract
Avoid failures with:
● Unit tests, regression tests…
● Version control, code reviews…
● Monitoring
16. The ML Software System
Code = weak contract
Data processing becomes key. Plumbing grows.
Sculley et al. Hidden technical debt in machine learning systems, NeurIPS 2015.
24. Should you use Machine Learning?
Are you reaching the limits of your current system?
- Too many rules to handle?
- Too complex to create rules?
- No mathematical solution?
What are the consequences of a wrong prediction?
26. 1. Quantify your performance
Design metrics that measure how good your output is.
● Sequences of moves: game won?
● Individual moves: proxy score
27. 1. Quantify your performance
✅ or ❌?
There is a gap between the ideal you want to optimize and what you can actually measure.
28. 1. Quantify your performance
You may not be able to compute the same metrics online and offline.
- Online: number of games won, engagement…
- Offline: prediction accuracy, reconstruction accuracy…
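Following the game example above, a minimal sketch of an online metric versus an offline proxy metric (function names and the sample moves are hypothetical):

```python
def win_rate(game_results):
    """Online metric: fraction of games won (1 = win, 0 = loss)."""
    return sum(game_results) / len(game_results)

def move_accuracy(predicted_moves, reference_moves):
    """Offline proxy: fraction of predicted moves matching a reference player."""
    matches = sum(p == r for p, r in zip(predicted_moves, reference_moves))
    return matches / len(reference_moves)

print(win_rate([1, 0, 1, 1]))                                 # 0.75
print(move_accuracy(["e4", "d4", "c4"], ["e4", "d4", "g3"]))  # 0.666...
```

The proxy is cheap to compute offline on logged data, but it is only correlated with the online metric you actually care about, which is why the two must be reconciled later (step 7).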
29. 2. Have a strong non-ML baseline
Your current algorithm is a good candidate.
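A non-ML baseline can be as simple as the hand-written rules you already run. A sketch (feature names and thresholds are hypothetical) of wrapping such rules so they can be scored with the same metric as any future model:

```python
def rule_based_baseline(features):
    """Hypothetical hand-written rule: alert when either threshold is exceeded."""
    return 1 if features["errors_per_min"] > 10 or features["latency_ms"] > 500 else 0

def accuracy(predict, labelled_data):
    """Score any predictor the same way: rules today, an ML model later."""
    return sum(predict(x) == y for x, y in labelled_data) / len(labelled_data)

data = [
    ({"errors_per_min": 3,  "latency_ms": 120}, 0),
    ({"errors_per_min": 42, "latency_ms": 90},  1),
    ({"errors_per_min": 2,  "latency_ms": 800}, 1),
]
print(accuracy(rule_based_baseline, data))  # 1.0
```

Keeping the evaluation function independent of the predictor makes the later rules-vs-model comparison a one-line change.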
30. 3. Collect Data
- Lots of data
- Store it!
- Use a data-analysis-friendly format
- Collect labels!
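One way to apply these points, sketched with the standard library only (all field names are hypothetical): log every request as one self-describing JSON line, so labels can be joined in later and the file loads directly into analysis tools.

```python
import json

def make_record(event_id, features, prediction, label=None):
    """One event per line (JSONL): easy to append, easy to load for analysis.
    The label is often unknown at prediction time; store None now and
    join the label in later using event_id."""
    return json.dumps({"event_id": event_id, "features": features,
                       "prediction": prediction, "label": label})

line = make_record(42, {"user_age": 31, "country": "FR"}, prediction=0.82)
# with open("events.jsonl", "a") as f: f.write(line + "\n")
```

Columnar formats such as Parquet serve the same purpose at scale; the key point is that every record carries its features, its prediction, and a slot for its label.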
31. 4. Analyse and Clean your Data
- Missing data?
- Outliers?
- Failure modes?
- Change over time?
Are there some obvious patterns that emerge?
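A minimal sketch of the first two checks using only the standard library (the sample values are made up); a real pipeline would typically use pandas or similar:

```python
import statistics

def data_quality_report(values):
    """Count missing entries and flag crude outliers using the median
    absolute deviation (MAD), which is robust to the outliers themselves."""
    present = [v for v in values if v is not None]
    med = statistics.median(present)
    mad = statistics.median([abs(v - med) for v in present])
    outliers = [v for v in present if mad > 0 and abs(v - med) > 5 * mad]
    return {"missing": len(values) - len(present), "outliers": outliers}

print(data_quality_report([1.0, 1.2, None, 0.9, 120.0]))
# {'missing': 1, 'outliers': [120.0]}
```

MAD is used instead of the standard deviation because a single extreme value inflates the standard deviation enough to hide itself.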
32. 5. Run Offline Experiments
Improve your baseline. Change the parameters. Use more inputs.
Still not ML, but now you can measure:
- How the output distribution changes
- How the performance changes
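Measuring how the output distribution shifts between two versions can be as simple as histogramming the outputs on the same inputs (the functions and thresholds below are hypothetical):

```python
from collections import Counter

def output_distribution(predict, inputs):
    """Histogram of outputs; compare it across versions to spot shifts."""
    return Counter(predict(x) for x in inputs)

baseline  = lambda x: x > 5   # current rule
candidate = lambda x: x > 3   # tweaked parameter under test
inputs = [1, 2, 4, 6, 8]
print(output_distribution(baseline, inputs))   # Counter({False: 3, True: 2})
print(output_distribution(candidate, inputs))  # Counter({True: 3, False: 2})
```

The same harness later takes an ML model in place of `candidate`, so the measurement infrastructure is ready before any model exists.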
33. 6. Prepare for changes in production
Can you compare two versions of your function in a production setting?
- Shadow pipeline
- Blue/green testing
- A/B tests
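A shadow pipeline, for instance, can be sketched in a few lines (all names are hypothetical): the candidate runs on live traffic, but its output is only logged, never served.

```python
def shadow_call(live_fn, candidate_fn, request, log):
    """Serve the live version; run the candidate on the same input for comparison."""
    live_out = live_fn(request)
    try:
        cand_out = candidate_fn(request)
        log.append({"request": request, "live": live_out,
                    "candidate": cand_out, "agree": live_out == cand_out})
    except Exception as err:
        # the candidate must never be able to break the live path
        log.append({"request": request, "error": repr(err)})
    return live_out

log = []
result = shadow_call(lambda x: x * 2, lambda x: x * 2 + 1, 10, log)
print(result)           # 20 (the user only ever sees the live output)
print(log[0]["agree"])  # False
```

Blue/green and A/B testing differ in that real traffic actually hits the new version; shadowing is the safest first step.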
38. 7. Validate Offline/Online agreement
- Put the new baseline into production
- Measure the outcome
Does it match your expectations?
- Yes: congratulations! You're good to go and have fun with shiny ML models (OK, not so fast, I'm joking).
- No: time to investigate.
39. 7. Validate Offline/Online agreement
If online results don't match offline expectations:
- Are you sure your metrics are good?
- Is your offline data good enough?
- Are there hidden feedback loops skewing the data?
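One crude way to operationalise "does it match?" (the tolerance is an arbitrary assumption, not a rule from the talk): check that the online gain has the same sign as, and a comparable magnitude to, the offline estimate.

```python
def gains_agree(offline_gain, online_gain, tolerance=0.5):
    """Same sign, and within a relative tolerance of the offline estimate."""
    if offline_gain == 0:
        return online_gain == 0
    ratio = online_gain / offline_gain
    return ratio > 0 and abs(ratio - 1) <= tolerance

print(gains_agree(0.10, 0.08))   # True  (close enough to the expected gain)
print(gains_agree(0.10, -0.02))  # False (sign flipped: investigate)
```

A failed check does not tell you which of the three questions above is the culprit; it only tells you to stop and investigate before shipping anything more complex.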
44. Remember…
- Machine learning engineering is not exactly like software engineering
- Think data. CLEAN data.
- No-ML can be a strong baseline and can be OK.
- Evaluation is key: you need to be able to evaluate the cost/performance tradeoff.
- You CAN anticipate your ML future
45. Resources
- Machine Learning Engineering in Action, Ben Wilson
- Machine Learning Engineering, Andriy Burkov
- Designing Machine Learning Systems, Chip Huyen
- MLOps Zoomcamp, YouTube series
46. References
- Léon Bottou's keynote at ICML 2015
- Software 2.0, blog post by Andrej Karpathy (2017)
- Sculley et al. Hidden technical debt in machine learning systems, NeurIPS 2015.
- Amershi et al. Software engineering for machine learning: A case study, ICSE-SEIP 2019.
- Biswas et al. The art and practice of data science pipelines: A comprehensive study of data science pipelines in theory, in-the-small, and in-the-large, ICSE 2022.