Why do so many machine learning
models not make it to
production?
Dana Aonofriesei
800 employees, 8 locations
+136 million reviews on the platform
+580 websites with reviews
About
Some say that as many as 87% of
Data Science projects never make it to production
Source: VentureBeat article, 19 July 2019
37% of 3,000 surveyed leaders had
deployed AI or would do so shortly
Source: Gartner report, 21 January 2019
How do you find this?👍 👎
From 160 reviewed AI use cases, 88% did
not progress beyond the experimental
stage
Source: McKinsey report, June 2017
What is the root cause?
1. ML quality & performance concerns
2. A simpler solution
3. Management support
4. Other
The Data
Data is an expensive asset:
1. Quality labelled data
2. Cleaned and consistent data
3. Data not accessible
Photo by Sharon McCutcheon on Unsplash
Labelled Data
1. Buy
2. Label in-house
3. Label externally
Available Data
1. Available but not accessible
2. Clean, processed, ready to use
Recommendation 👉
Invest in Data as a Service
Source: towardsdatascience.com article, 16 August 2020
Eat your own dog food:
Labelling system embedded in the existing flows
so users label data when using the product.
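One way to picture a labelling system embedded in an existing flow is the minimal Python sketch below. It is only an illustration under stated assumptions: `LabelStore`, `record`, `consensus`, and the `min_votes` threshold are hypothetical names invented here, not part of the talk or of any product. The idea is that each user action (for example, flagging a review) is stored as a vote, and an item counts as labelled once enough users agree.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class LabelStore:
    """Hypothetical in-memory stand-in for wherever labels would be persisted."""

    # Maps an item id to a Counter of the labels users gave it.
    votes: dict = field(default_factory=dict)

    def record(self, item_id: str, label: str) -> None:
        """Store one user action (e.g. a 'report as spam' click) as a label vote."""
        self.votes.setdefault(item_id, Counter())[label] += 1

    def consensus(self, item_id: str, min_votes: int = 2):
        """Return the most frequent label once it has at least min_votes, else None."""
        counts = self.votes.get(item_id)
        if not counts:
            return None
        label, n = counts.most_common(1)[0]
        return label if n >= min_votes else None
```

In a real product the `record` call would sit inside an existing user action, so labelling costs users no extra step; the agreement threshold is a design choice to filter out one-off or mistaken clicks.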
The Tools
Emily Potyraj (Watkins), January 2021
Program Manager at Microsoft. Streamlining data pipelines and scaling AI projects., 25 June 2020
B. The Tools
Elements for ML system. Google Cloud, 17 August 2021
Paper: Hidden Technical Debt in Machine Learning Systems
The rise of MLOps
Continuous delivery for machine
learning systems
Recommendation 👉
Define MLOps components
needed or desired and invest in
developing those components
Define what ML tech debt
means.
The Process
The People
Article by Cristiano Breuel, Senior AI manager at Nubank, 13 July 2021
Software development process
is different from
ML development process
Sources: ProductPlan, 6 September 2021; Medium, March 2010
Software development: daily or weekly releases
vs
ML development: one-time release; experimentation or data
discovery can take weeks
Engineers: builders. How do I build this?
vs
Data Scientists: How do I answer this?
How to bring Engineers and Data
Scientists together?
Recommendation 👉
Multidisciplinary teams:
One team: Data Scientist, Engineer, Designer,
Product Manager
Touch points:
RFCs (Requests for Comments), demos, reviews.
The People
What are your assumptions about the user
experience, model results and model
development?
Article by Taggart Bonham, AI Product Manager, January 2014
Recommendation 👉
Focus on user experience
How will the users interact with the predictions or the
outcome of the model?
ML development checklist
✓ rate model complexity
✓ are all these features mandatory for this ML iteration?
✓ what data quality checks to consider?
✓ what is required for performing a dry-run?
✓ ...
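As an illustration of the "data quality checks" item, the sketch below shows two simple checks that could run before training: the share of missing values per required column, and the number of exact duplicate rows. The function name `check_rows`, the `max_missing_ratio` threshold, and the row format (a list of dicts with hashable values) are assumptions made for this example, not something prescribed in the talk.

```python
def check_rows(rows, required, max_missing_ratio=0.05):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    if not rows:
        return ["dataset is empty"]

    problems = []

    # Check 1: too many missing values (None or empty string) in a required column.
    for col in required:
        missing = sum(1 for r in rows if r.get(col) in (None, ""))
        ratio = missing / len(rows)
        if ratio > max_missing_ratio:
            problems.append(f"{col}: {ratio:.0%} missing")

    # Check 2: exact duplicate rows, which can leak between train and test splits.
    seen = set()
    dupes = 0
    for r in rows:
        key = tuple(sorted(r.items()))
        if key in seen:
            dupes += 1
        seen.add(key)
    if dupes:
        problems.append(f"{dupes} duplicate row(s)")

    return problems
```

A dry run could call `check_rows` on a sample of the data and stop the pipeline if the returned list is not empty.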
Upskill and facilitate learning
Summary
💰 Invest in Data as a Service
🧰 Define MLOps in your organisation
🤝 Find the harmony between ML and software development
💛 Upskill and facilitate learning
