6. Houston! We have a problem (again)
• Quotation #1:
“Once we had a great model; we used it X times, then its
performance went down. We do not use it anymore…”
Problem: models get corrupted with time and…data.
i.e. AI/ML has no Data Immunity!
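One way to notice this kind of decay early is to track accuracy over the most recent predictions and compare it with the accuracy measured at deployment. The sketch below is only an illustration: the class `AccuracyMonitor`, its parameters, and the threshold values are hypothetical, not something shown in the talk.

```python
from collections import deque


class AccuracyMonitor:
    """Tracks accuracy over the most recent predictions (hypothetical helper)."""

    def __init__(self, baseline: float, window: int = 100, tolerance: float = 0.05):
        self.baseline = baseline      # accuracy measured when the model was deployed
        self.tolerance = tolerance    # allowed drop before we raise a flag (assumed value)
        self.outcomes = deque(maxlen=window)  # keeps only the last `window` results

    def record(self, predicted, actual) -> None:
        """Store whether one prediction matched the observed outcome."""
        self.outcomes.append(predicted == actual)

    def accuracy(self) -> float:
        """Accuracy over the current window; falls back to the baseline if empty."""
        if not self.outcomes:
            return self.baseline
        return sum(self.outcomes) / len(self.outcomes)

    def degraded(self) -> bool:
        """True once windowed accuracy drops below baseline minus tolerance."""
        return self.accuracy() < self.baseline - self.tolerance
```

A caller would invoke `record` whenever ground truth arrives and check `degraded()` to decide whether the model needs attention.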
Global Data Science Conference 2017
7. Houston! We have a problem (again)
Usual Software System:
Stays intact after deployment
(i.e. no functional changes w.r.t. data)
8. Houston! We have a problem (again)
• Quotation #2:
“We have a great team, they’ve built great models, but we need
to process X million rows per (sec, min, …)”
Problem: the tech./infra. used is not sufficient.
9. Houston! We have a problem (again)
Data infrastructures are getting complicated:
- Too many components
- Diverse characteristics of data
- Configuration Mgmt.
- Re-engineering of models
- …
10. Houston! We have a problem (again)
• Quotation #3:
“We invested in the XYZ tool, but we can’t use it effectively,
because we need to export manually each time we need it”
Problem: integrability.
11. Houston! We have a problem (again)
- Data Inconsistency (format, etc.)
- Non-standard APIs
- Incompatible APIs
- …
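Format inconsistency is often handled by normalizing every incoming record to one internal schema at the boundary. The sketch below is only an illustration: the field names (`id`, `date`, `amount`), the accepted date formats, and the decimal-comma handling are assumptions, not details from the talk.

```python
from datetime import datetime

# Date formats we assume the upstream tools might emit (hypothetical list).
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d.%m.%Y")


def parse_date(text: str) -> str:
    """Return the date as ISO 8601 (YYYY-MM-DD), trying each known format in turn."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue  # not this format, try the next one
    raise ValueError(f"unrecognized date format: {text!r}")


def normalize_record(raw: dict) -> dict:
    """Map one raw record onto a single internal schema (assumed fields)."""
    return {
        "id": int(raw["id"]),
        "date": parse_date(raw["date"]),
        # Accept a decimal comma as well as a decimal point.
        "amount": float(str(raw["amount"]).replace(",", ".")),
    }
```

Rejecting unknown formats with an error, rather than guessing, keeps bad data from silently reaching the model.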
12. Houston! We have a problem (again)
• Quotation #4:
“Each time it takes X hrs. to produce results, because we
do it manually/it does not scale”
Problem: scalability.
13. Houston! We have a problem (again)
- Bootstrap/Cold start issues
- Data hose coupling/de-coupling
- …
14. Top 5 Ideas to Steal
• Idea #1: Use basic DevOps cycle, but be careful!
15. Top 5 Ideas to Steal
• Idea #2: De-couple API/Model
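A minimal sketch of what such decoupling could look like: the API layer depends only on a small `predict` interface, so a model can be replaced without touching the request handling. All names here (`Model`, `MeanModel`, `MaxModel`, `PredictionService`) are hypothetical stand-ins, not the speaker's implementation.

```python
from typing import Protocol, Sequence


class Model(Protocol):
    """The only contract the API layer relies on (assumed interface)."""

    def predict(self, features: Sequence[float]) -> float: ...


class MeanModel:
    """Toy stand-in for a first model version."""

    def predict(self, features: Sequence[float]) -> float:
        return sum(features) / len(features)


class MaxModel:
    """Toy stand-in for a retrained or rewritten model."""

    def predict(self, features: Sequence[float]) -> float:
        return max(features)


class PredictionService:
    """API layer: request and response shapes stay fixed while models change."""

    def __init__(self, model: Model):
        self._model = model

    def swap_model(self, model: Model) -> None:
        """Replace the model behind the API without changing the API itself."""
        self._model = model

    def handle(self, request: dict) -> dict:
        score = self._model.predict(request["features"])
        return {"score": score}
```

Usage: build `PredictionService(MeanModel())`, serve requests through `handle`, and later call `swap_model(MaxModel())`; callers keep sending the same request shape throughout.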
16. Top 5 Ideas to Steal
• Idea #3: Use schedulers & containers
17. Top 5 Ideas to Steal
• Idea #4: Consider re-writing (i.e. don’t stick to a framework)
18. Top 5 Ideas to Steal
• Idea #5: Automate!
19. Summary
1. Production/business use of AI/ML differs from an academic
or competition focus.
2. Focus on how to keep models up and running
3. Remember the Data Immunity problem
4. API/Model decoupling is important
5. Adopt best practices (already established, e.g. DevOps)
6. Automate
20. Shameless Self Promotion
If you want to try it out, let me know.
ekrem@hiddenslate.com