Yaroslav Ravlinko "Build your own Machine Learning Platform or how to develop, test and release ML/AI solutions in live environment (DevOps)"


Data Science Practice

Published in: Engineering
  1. Build your own Machine Learning Platform
     Survival guide for a Data Science Engineer
     Yaroslav Ravlinko, Solution Architect
  2. About me
     • I'm a Solution Architect at Grid Dynamics
     • I’ve been working in the IT industry for more than 10 years and have delivered more than 50 projects in different domains
     • Working with “big” data in production since 2013
     • Tech agnostic
     • Really know why DevOps is not the name of a position and why unicorpses live
     Yaroslav Ravlinko, Grid Dynamics, Lviv, Ukraine
  3. Data Processing, Machine Learning and Business
     • Recommendation Engines for consumers
     • Anomaly Detection (security)
     • Chat bots
     • Visual Recognition (Inventory, Product Catalog and Search)
  4. What will not be discussed here
     • Machine Learning and Data Processing services for AWS/Azure/GCP
       Because that’s another long, long story, so I’m keeping it for the next talk
     • Specific implementation of any algorithm
       Because I’m not that good at this
     • Specific “DevOps” of “Data Science” toolsets
       Because your “hammer” is your toy, so knock yourself out
  5. What will be discussed here
     • Review of typical problems that you will potentially face
       People tend to make the same mistakes
     • Which Data Processing/ML “ecosystem” is right for you
       Short guide to existing platforms
     • Review of some solutions that can help you avoid mistakes and save money
       Because we already made those mistakes
  6. You are …
  7. Machine Learning Workflow
  8. Machine Learning Ecosystem (one of them)
  9. Machine Learning “Challenges”
     • Diverse and ever-growing zoo of technologies for Data Scientists and ML engineers (Python, R, Scala, Java, Spark/PySpark, TF Cluster, Python/Flask, TF Serving, Elasticsearch, Hadoop, Cassandra … and so on).
     • Reproducible environments at any scale (local, sandbox, prod). The “It works on my machine” syndrome is still widespread in our industry.
     • “One size fits all” approach. Because once you have mastered the hammer, everything around you becomes a nail.
  10. Why are those challenges?
      • Cost of development
        What is the price of entry?
      • Resource Management and Financial/Cost Management
        Every minute that your system is not working costs you money
      • Production
        DevOps (delivery), SRE (availability and SLA)
  11. Assessment Criteria
  12. Decision tree
  13. Solutions
  14. Hadoop ecosystem
  15. Mesos ecosystem
  16. Kubernetes ecosystem
  17. Example of platform
  18. Recommender based on Alternating Least Squares (ALS) algorithm *
  19. Development and Feature Engineering
  20. JupyterHub as DataLab on-premise
  21. ETL, Train and Serve Recommender
  22. Delivery Pipeline with Spark, HDFS, HBase, Python and a lot of Scala
  23. Serving with re-training
  24. Implementation
  25. Demo
  26. At the end
      • Always concentrate on Value = Benefits - Cost
        You don’t need an aircraft carrier to deliver a sofa, but don’t rely on a bicycle either
      • Don’t pay for things that you aren’t using
        If you want some service but are not ready to pay for it from your own pocket, you don’t really need it
      • Production
        Start thinking about how your code will work in a real environment
  27. Questions?
  28. About Grid Dynamics
      Founded in 2006, Grid Dynamics is an engineering services company built on the premise that cloud computing is disruptive within the enterprise technology landscape. Since that time, we’ve had the privilege to help companies like Microsoft, eBay, PayPal, Cisco, Macy’s, Yahoo, ING, Bank of America, Kohl's, among others, to re-architect their core mission-critical systems, develop new cloud services, accelerate innovation cycles, increase software quality, and automate application management. Grid Dynamics has multiple locations in the USA and Europe, and employs over 1000 expert engineers worldwide.
  29. Thank you!
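
Slide 18 names a recommender based on Alternating Least Squares (ALS), but the transcript carries no details of it. As a rough illustration of the idea only, here is a minimal sketch in plain Python. It is not the implementation from the talk (the slides mention Spark and Scala for the pipeline), and every name in it (`solve`, `update_factors`, `als`, `predict`) as well as the defaults for `rank`, `reg` and `iters` are assumptions made for this example. It factorises explicit ratings into user and item vectors by fixing one side and solving a small ridge regression for each row of the other side, then swapping.

```python
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def update_factors(fixed, ratings_by_row, rank, reg):
    """Recompute one side's factors while the other side stays fixed."""
    out = []
    for observed in ratings_by_row:
        # Normal equations of ridge regression: (reg * I + sum v v^T) x = sum r v
        a = [[reg if i == j else 0.0 for j in range(rank)] for i in range(rank)]
        b = [0.0] * rank
        for idx, r in observed:
            v = fixed[idx]
            for i in range(rank):
                b[i] += r * v[i]
                for j in range(rank):
                    a[i][j] += v[i] * v[j]
        out.append(solve(a, b))
    return out


def als(ratings, n_users, n_items, rank=2, reg=0.1, iters=20, seed=0):
    """ratings is a list of (user, item, rating) triples with 0-based ids."""
    rng = random.Random(seed)
    users = [[rng.random() for _ in range(rank)] for _ in range(n_users)]
    items = [[rng.random() for _ in range(rank)] for _ in range(n_items)]
    by_user = [[] for _ in range(n_users)]
    by_item = [[] for _ in range(n_items)]
    for u, i, r in ratings:
        by_user[u].append((i, r))
        by_item[i].append((u, r))
    for _ in range(iters):
        users = update_factors(items, by_user, rank, reg)  # items fixed
        items = update_factors(users, by_item, rank, reg)  # users fixed
    return users, items


def predict(users, items, u, i):
    """Predicted rating is the dot product of the two factor vectors."""
    return sum(a * b for a, b in zip(users[u], items[i]))
```

To try it, pass a list of `(user, item, rating)` triples to `als` and call `predict` on a pair that was left out of the training data. At production scale this loop would normally be replaced by a distributed implementation, which is where the Spark stack named on slide 22 comes in.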