Data Science in Production
How Docker, K8s, and Airflow Drive
Adoption of Data Science Solutions at
JW Player
Nir Yungster
Disclaimer!
For this talk, we’ll cover technical approaches that can help drive adoption of data science solutions...
But technology is not a remedy for everything (people, process,…)
Applied Data Science Is Driving Innovation
Across Industries
A Perhaps Familiar
Story
Our Model
Our Users
Data Science Requires Three Pieces to Succeed
1. Access to data
2. Effectiveness in research and development of solutions
3. Ability to deliver solutions when and where they’re needed
Agenda
Part I: The Challenge of Data Science in Production
Part II: The Data Science Platform at JW Player
Part III: Data Science in Production at JW Player
Part I: The Challenge of Data Science in Production
What Does Production Data Science Mean?
● Model Performance
○ E.g. accuracy, precision, etc.
● Production-Level Code
○ Portability: ease of deploying across environments
○ Maintainability: testing, monitoring, documentation
○ Scalability: ability to handle high traffic volume
○ Reliability: service up-time
Solution: Scientists and Engineers Collaborating
● Both: "I want model performance!"
● Scientist: "I want accuracy, interpretability, & validation!!"
● Engineer: "I want efficiency, reliability, & SLEEP!!"
Collaboration: The Good, the Bad, and The Ugly
● The Good
○ Positive collaboration
○ Both sides’ primary goals achieved
● The Bad
○ Models in Limbo
○ Mutant models
● The Ugly
○ Misunderstanding, distrust
○ Barriers between teams
There are tools that can help!
● To make production data science more feasible
● To make Data Science teams more self-sufficient
● To enable better collaboration across teams
Part II: The Data Science Platform at JW Player
About JW Player
● Video player + platform
● Headquarters in NYC
● SaaS business
● 15k subscribers, 2M free
● 5% of video plays across the web
Video Publisher Data Products
● Video Recommendation Engine
● Automated Thumbnail Selection
● Shot/Scene Detection
Data Science Within JW Player
● Provide R&D for data products
● Centralized team (6 members)
○ Including 2 software developers
● Work with a variety of product and engineering teams across the company
Key Elements of JW Data Science Infrastructure
● Docker: Container Service (Portability)
● Airflow: Workflow Orchestration (Maintainability)
● Kubernetes: Application Orchestration (Scalability, Reliability)
Docker is a Container Service
What’s a container?
● A standard wrapper for
tasks & applications so
that they run consistently
across environments
Container Portability Reduces Integration Pain
● Applications / tasks can run in any environment
● Removes friction arising from development and deployment in different environments
○ Across teams, within teams
dockerize all the things!
Airflow Orchestrates Workflows
● A workflow consists of a series of tasks
○ E.g. data processing, model training
○ Workflows run on a schedule
● Airflow helps with Maintainability
○ Monitoring & alerting
○ Web interface for investigating logs,
rerunning tasks / entire workflows
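The idea of a workflow as a series of dependent tasks can be sketched without Airflow itself. The example below is plain Python built on the standard library's `graphlib` (Python 3.9+); it is not Airflow's API, and the task names (`extract_plays`, `build_features`, `train_model`) are hypothetical.

```python
from graphlib import TopologicalSorter


def run_workflow(tasks, dependencies):
    """Run each task once, after all of the tasks it depends on.

    tasks: dict mapping task name -> zero-argument callable
    dependencies: dict mapping task name -> set of upstream task names
    Returns the list of task names in the order they ran.
    """
    # Include every task, even those with no upstream dependencies.
    graph = {name: dependencies.get(name, set()) for name in tasks}
    order = list(TopologicalSorter(graph).static_order())
    for name in order:
        tasks[name]()
    return order


# Hypothetical three-step workflow: data processing, then model training.
log = []
tasks = {
    "train_model": lambda: log.append("train_model"),
    "extract_plays": lambda: log.append("extract_plays"),
    "build_features": lambda: log.append("build_features"),
}
dependencies = {
    "build_features": {"extract_plays"},
    "train_model": {"build_features"},
}
order = run_workflow(tasks, dependencies)
```

A real orchestrator adds what this sketch leaves out: scheduling, retries, alerting, and a web interface for logs and reruns.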
Kubernetes Orchestrates Applications
● Deploy & manage dockerized applications that run continuously (e.g. an API service)
● Built-in Scaling, Reliability, Monitoring
● JW Player maintains an internal deployment service powered by Kubernetes
Kubernetes Basics
(Diagram: a Kubernetes Master Node and four Worker Nodes hosting pod replicas)
App.yaml
Pod-1:
Replicas: 2
Pod-2:
Replicas: 3
● Application scaling made easy
○ Choose number of replicas
○ Scaling up is a configuration change
● Reliability
○ Master node monitors system
○ Ensures correct number of replicas
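The "ensures correct number of replicas" behavior can be pictured as a loop that compares desired state with running state. The sketch below is a toy model in plain Python, not Kubernetes code or its API; the function names `reconcile` and `apply_actions` are made up for illustration, and the replica counts mirror the App.yaml on the slide.

```python
def reconcile(desired, running):
    """Return the changes needed to make `running` match `desired`.

    desired: dict mapping app name -> desired replica count
    running: dict mapping app name -> currently running replica count
    Returns a dict mapping app name -> signed change (+ start, - stop).
    """
    actions = {}
    for app in sorted(set(desired) | set(running)):
        diff = desired.get(app, 0) - running.get(app, 0)
        if diff != 0:
            actions[app] = diff
    return actions


def apply_actions(running, actions):
    """Apply the changes and return the new running state."""
    new_state = dict(running)
    for app, diff in actions.items():
        new_state[app] = new_state.get(app, 0) + diff
    # Drop apps that no longer have any replicas.
    return {app: n for app, n in new_state.items() if n > 0}


desired = {"pod-1": 2, "pod-2": 3}   # desired state, as in App.yaml
running = {"pod-1": 2, "pod-2": 1}   # two pod-2 replicas have crashed

# Reliability: the monitor notices the gap and restores the replicas.
recovery = reconcile(desired, running)
running = apply_actions(running, recovery)

# Scaling up is a configuration change: edit the desired count only.
desired["pod-1"] = 4
scale_up = reconcile(desired, running)
running = apply_actions(running, scale_up)
```

Recovery from a crash and scaling up go through the same path here: only the desired state is edited by hand.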
Key Elements of JW Data Science Infrastructure
● Docker: Container Service (Portability)
● Airflow: Workflow Orchestration (Maintainability)
● Kubernetes: Application Orchestration (Scalability, Reliability)
Part III: Data Science in Production at JW Player
Three flavors of production data science
● Backend Microservices
○ Server-side API running in Kubernetes
● Plugins (aka Frontend microservices)
○ Client-side plugin running alongside the Player
● “Integrations” with engineering
○ Data Science conducts R&D, develops a model
○ Works with Engineering to productionize
Backend Microservice
● What is involved?
○ Deploy model as application on Kubernetes
○ Backend service with API
● When is this approach common?
○ Easiest for a new model
● Benefits
○ Data Science in full control of model, updates
○ Decoupled architecture
○ Clear ownership, boundaries
(Diagram: Backend, Frontend, Microservice)
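A "backend service with API" around a model can be very small. The sketch below uses only Python's standard library; the `/recommend/<media_id>` route, the `recommend` stand-in model, and port 8080 are assumptions for illustration, not JW Player's actual service.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def recommend(media_id, k=3):
    """Stand-in for a real model: returns k hypothetical related media ids."""
    return [f"{media_id}-rec{i}" for i in range(1, k + 1)]


def route(path):
    """Map a request path to (HTTP status, JSON-serializable payload)."""
    parts = path.strip("/").split("/")
    if len(parts) == 2 and parts[0] == "recommend" and parts[1]:
        media_id = parts[1]
        return 200, {"media_id": media_id, "recommendations": recommend(media_id)}
    return 404, {"error": "not found"}


class ModelHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, payload = route(self.path)
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8080):
    """Start the API; in a container this would be the entry point."""
    HTTPServer(("0.0.0.0", port), ModelHandler).serve_forever()
```

Calling `serve()` starts the service. Keeping `route` separate from the HTTP handler lets the model logic be tested without opening a socket, which helps when Data Science owns both the model and its updates.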
Client-side Plugin
● What is involved?
○ Effectively a client-side microservice
○ Written in JavaScript
● When is this approach common?
○ Easiest for a new model
○ If the model is lightweight
● Benefits
○ Decoupled architecture
○ Reduced network traffic, low latency
(Diagram: Backend, Frontend, Plugin)
Integration with Engineering
● What is involved?
○ Translating / integrating model
○ Requires very close coordination
○ Often involves rewriting model code
● When is this approach common?
○ Often the case when you’re improving upon an existing product
● Possible Pitfalls
○ Tangled web
○ Unclear path to update/iterate
(Diagram: Backend, Frontend, Model ??)
Final Thoughts
Some Takeaways
● Owning models means more maintenance responsibility
○ Can take away from core DS mission
● Microservices don’t remove the need to collaborate with other teams on models
○ To ensure feature fidelity
○ To ensure proper usage
○ SLAs
Some Tips
● Think about production from the beginning of R&D
● Build intelligent fallbacks to ease reliability concerns
○ When one element of a service fails, allow for a slightly degraded state (e.g. serving a stale model)
● Build a microservice that you jointly maintain with engineers
● Consider if your next hire should be a software engineer
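The fallback tip can be sketched as a service that keeps its last known-good model and uses it when loading the latest one fails. This is a minimal illustration in plain Python; `ModelService`, `broken_loader`, and the constant-score stale model are all hypothetical names and values.

```python
class ModelService:
    """Serve predictions from the freshest model that loads successfully."""

    def __init__(self, load_latest, stale_model):
        self._load_latest = load_latest   # callable that returns a model or raises
        self._stale_model = stale_model   # last known-good model
        self.degraded = False

    def predict(self, features):
        try:
            model = self._load_latest()
            self._stale_model = model      # remember it as the new known-good model
            self.degraded = False
        except Exception:
            model = self._stale_model      # fall back rather than fail the request
            self.degraded = True
        return model(features)


def broken_loader():
    """Simulates the model store being unreachable."""
    raise ConnectionError("model store unreachable")


# The stale model here is a stand-in that always returns the same score.
service = ModelService(broken_loader, stale_model=lambda features: 0.5)
score = service.predict({"plays": 10})
```

The `degraded` flag gives monitoring something to alert on, so a stale model is served knowingly rather than silently.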
Acknowledgement
Graham Edge
Nil Timor
Olga Minkina
Rik Heijdens
Rob van Ejik
