Big Data Spain - Nov 17 2016 - Madrid Continuously Deploy Spark ML and TensorFlow AI Models: From Jupyter Notebook to NetflixOSS Microservice

In this talk, I describe some recent advancements in Streaming ML and AI Pipelines to enable data scientists to rapidly train and test on streaming data - and ultimately deploy models directly into production on their own with low friction and high impact.

With proper tooling and monitoring, data scientists have the freedom and responsibility to experiment rapidly on live, streaming data - and deploy directly into production as often as necessary. I’ll describe this tooling - and demonstrate a real production pipeline using Jupyter Notebook, Docker, Kubernetes, Spark ML, Kafka, TensorFlow, Jenkins, and Netflix Open Source.


  1. BIG DATA SPAIN 2016 Continuously Deploy ML and AI Models: From Notebook to Microservice Thank You, Madrid! Chris Fregly, Research Scientist @
  2. WHO AM I? Chris Fregly, Research Scientist @ PipelineIO (formerly Netflix and Databricks) http://pipeline.io
  3. WORKSHOP ON SATURDAY, NOV 19TH HERE IN MADRID!! http://pipeline.io
  4. SOURCE CODE AND DOCKER IMAGES • GitHub Repo: ~900 Stars, ~300 Forks • DockerHub Repo: ~6,000 Pulls
  5. WHAT IS PIPELINE.IO? Extending Your ML Pipelines into Production 100% Open Source! http://pipeline.io
  6. BRAINSTORMING AND VALIDATING • Major Gaming Company • Large Ride Sharing Service • Popular Q & A Site • Online Clothing Retailer • Dominant Video Streaming
  7. PIPELINE.IO FOCUS • Model Deploying and Testing • Model Scaling and Serving • Online Model Training • Native Code Generation
  8. MODEL DEPLOYING AND TESTING Continuously Test and Deploy Models in Production!
  9. MODEL SCALING AND SERVING
  10. ONLINE MODEL TRAINING • Continuous, Incremental, and Partial Training • Kafka + Spark Streaming + Spark ML • Real-time, Dynamic Recommendations
  11. NATIVE CODE GENERATION Generate Optimized Code from Spark ML!
  12. BECOME A CONTRIBUTOR!
  13. PIPELINE.IO FOCUS FOR 2017 • Performance, Performance, Performance • Native Code Generation: CPU + GPU • More Global Contributors!
  14. WE’RE HIRING!! • Kafka, Spark ML, and TensorFlow Contributors • Systems Engineers • GPU/CUDA Engineers • C++, Java, Scala, Python WE ONLY HIRE NICE PEOPLE!!
  15. DEMO TIME!
  16. DEMO: NETFLIX-BASED MICROSERVICES Circuit Breakers and Request Batching
  17. DEMO: DEPLOY NOTEBOOK TO PROD Deploy to Cloud or On-Premise!
  18. DEMO: NATIVE CODE GENERATION
  19. DEMO: TENSORFLOW SERVING
  20. THANK YOU FOR CHOOSING ME! http://pipeline.io
